A C++ Matrix Template Class

, April 01, 1997

April 1997/A C++ Matrix Template Class

A C++ Matrix Template Class

Robert Mashlan

It seldom makes sense to overload all those arithmetic operators on a class you define, but matrix arithmetic really benefits from this powerful aspect of C++.

The C++ language contains some powerful features that simplify expressing many mathematical abstractions. For instance, operator overloading allows you to express mathematical operations that can logically be applied to user-defined types. The addition of templates to the language allows you to further generalize classes to work with similar types. In this article, I present a matrix template class which uses both features of the language.

A matrix is a mathematical abstraction that represents a two-dimensional array of elements. Usually, a matrix might use real numbers for elements, but a matrix does not need to be limited to these type of elements. For instance, you can use complex numbers for elements. In fact, for the algebraic operations on a matrix, any type may be used that has an algebra with the following rules:

1) The type supports a commutative addition, that is A+B == B+A.

2) The type has an additive identity, that is A+0 == A, where 0 is the additive identity.

3) The type has a commutative multiplication, that is A*B == B*A.

4) The type has a multiplicative identity, that is A*I == A, where I is the multiplicative identity.

5) The type's additive identity, when multiplied with another object of the type, results in itself. That is, A*0 == 0.

6) The type has a division operation. For example if A*B == C, then C/B == A, as long as B is not the additive identity.

A user-defined C++ type T will work as long as the expression T(0.0) represents the additive identity and T(1.0) represents the multiplicative identity. It should also have overloaded operators defined for =, ==, !=, +, +=, -, -=, *, *=, /, /=, and unary -.

The matrix class makes use of exceptions to report run-time errors. It uses a few predefined exceptions which are declared in the standard C++ header file <stdexcept>. I use the following exceptions in my matrix class:

std::range_error — if range checking is enabled, this exception is thrown as the result of accessing an out-of-range element in the matrix.

std::domain_error — If you attempt binary operations on matrices that are of incompatible orders, this exception is thrown. It is also thrown for unary operations which are only valid for square matrices, as well as attempting to construct a degenerate matrix.

overdetermined and underdetermined — these exception classes are derived from std::domain_error. These may be thrown if the solution for the quotient matrix is under-determined or over-determined.

Representation

The matrix template class is defined in Listing 1. (For space reasons, portions of this class have been omitted from the listing. The complete listing is available electronically. See p. 3 for details.) The matrix class consists of three data members: the number of rows and columns, which define the order of the matrix, and an STL vector to contain the elements.

In my matrix implementation, I use const data members to represent the number of rows and columns in the matrix object. Data members which are const can be assigned only during construction of the object.

The implication of having the order specified by const data members of the class is that the order of the class cannot be changed after it is constructed. I made these members public so that the values can be easily accessed by users of the class. However, by using the const qualifier, I specify my intent that they should not be changed. (A malicious user of the class can still possibly change these member variables by using casts to remove the constness).

It would be possible to write a surrogate template class (a class that would represent the base matrix class by storing an object of that type as a member variable) which would use non-type template parameters to represent the order of the matrix. This would have the advantage of making the order of a matrix part of the object type. Logically, the order of a matrix has more to do with the object type than the representation. Doing so would make it possible to catch errors at compile-time rather than at run-time, such as using incompatible orders of matrices in addition or multiplication functions.

For instance, the matrix multiplication operation could be defined by the template function:

template<class T, unsigned m, unsigned n, unsigned p>
    matrix<T,m,n> operator*(
        const matrix<T,m,p>& A,
        const matrix<T,p,m>& B);

With this kind of function, the compiler wouldn't allow you to multiply a 2 x 3 matrix with a 1 x 3 matrix, since it would not match the template. But you could multiply a 2 x 3 matrix with a 3 x 1 matrix, with a result type being a 2 x 1 matrix.

Unfortunately, neither of the C++ compilers that I have (Microsoft Visual C++ 4.1 and Borland C++ 5.01) properly implements the operator* function I wrote above. Microsoft Visual C++ won't take non-type template parameters to a template function, and Borland C++ won't match the template function to a multiplication expression involving two matrices of the correct order.

Implementation

The matrix class has two constructors, a copy constructor and a constructor where the order of the matrix is specified, along with a pointer to an optional array of values used to initialize the array. If this pointer is null (the default argument), each element of the array is initialized to the additive identity of the element type. As I mentioned earlier, it is required of the element type T that the expression T(0.0) specify the additive identity. For example, double(0.0) specifies the additive identity for floating point numbers, and complex<double>(0.0) specifies the additive identity for the complex<double> type.

Note that I write T(0.0) and not

T(0). This is because 0 can convert to a pointer in C++. If the type T has two constructors, one that takes a double and the other that takes a pointer, the compiler will call the situation ambiguous and generate a diagnostic.

You can access elements of a matrix object two different ways. The first method defines a member function operator taking two unsigned arguments. If A is of type matrix, A(i,j) accesses the element in row i, column j of the matrix. The second method involves a member function called element that does the same thing. I supplied the latter so that I can write element(i,j) rather than (*this)(i,j) when referring to elements of the class within a member function. Note that element access uses C-style zero-based indices rather than one-based indices typically used by mathematical representations of matrices.

The access functions are conditionally compiled to include range checking, which is enabled by defining the macro RANGE_CHECK. If range checking is enabled, the access functions checks each access for an out-of-range condition. If the access is out of range, the matrix class throws an object of type std::range_error. You can turn off range checking to dramatically improve the performance of the code.

Each of the element access methods returns a reference to the element rather than a copy of the element. Thus, an element access method can be used as an lvalue — the expression can be placed on the left side of an assignment statement.

There are two overloaded versions of each element access method. One version is called for non-const matrix objects, while the other version is called for const matrix objects, so that the class can maintain const correctness. Accessing an element for a const matrix returns a const reference, which protects a const object from having its elements changed. Returning a const reference to the element also avoids a copy operation that would happen if the element were returned by value.

The matrix class provides operators related to scalar multiplication. You can multiply an object of type matrix<T> with an object of type T. The product of this multiplication is a matrix of the same order, with each of its elements multiplied by the scalar operand. Similarly, division by a scalar operand is also defined.

The matrix class provides operators for matrix addition and subtraction, where both operands are matrices. The result of the operation is a matrix of the same order of the operand matrices, with elements that are the result of addition and subtraction of the corresponding elements of the operand matrices. Matrix addition and subtraction require that the each operand be of the same order. If they are not, these methods will throw an object of type std::domain_error.

Matrix multiplication is defined as follows: If matrix A is of order M x P, and matrix B is of order P x N, then the product C is of order M x N and each element is the result of the summation:

The member function

matrix<T> matrix<T>::operator*(
    const matrix<T>&) const

defines this operation. If you use matrices with orders that cannot be multiplied, the method will throw a std::domain_error exception.

Note that matrix multiplication is not commutative. That is, A*B == B*A is not necessarily true or even valid.

One technique that is used to increase the accuracy of summing floating-point numbers in the matrix multiplication function is an algorithm known as Kahn Summantion. For a discussion of the Kahn Summnation algorithm, see the article ``Floating-Point Summation'' by Evan Manning in the September 1996 C/C+ User's Journal.

I have implemented the Kahn Summantion algorithm with a template class called kahn_sum. A kahn_sum object represents a running sum of an object of the type of the template parameter. It provides an operator+=, an operator-=, and a conversion operator.

If you use a type for the matrix that isn't based on floating-point numbers, (for instance, you may want to use a user-defined rational type or fixed-point number type, which won't benefit from Kahn Summation) then you may want to provide a specialization of the kahn_sum template class that doesn't use an error-correcting term.

Matrix Division

Matrix division is the most complex operation for matrices. For floating-point number types, the most efficient way to solve matrix divisions is to use an elimination technique like LU decomposition to reduce a matrix equation to a form which makes it easier to solve. If you need an efficient matrix division operation for floating-point numbers, you should not use this template class; or you should provide specializations for float, double, or long double that use a more efficient technique.

However, the problem with elimination is that it involves many divisions operations on the individual elements, which can cause rounding errors that degrade the result. In my matrix class, I assume that element division operations are not desirable, so I use a technique which is expensive computationally, but gives accurate results, since at most only one division is required for each element of the solution.

Since matrix division is not commutative, there are two different division operations. The matrix class defines the member functions leftdiv and rightdiv. Left division is the solution Q to the equation:

D * Q = N

Right division is the solution Q in the equation:

Q * D = N

The matrix class defines a member function called transpose that returns a copy of the operand matrix where the columns in the operands become the rows in the result. The transpose operation has the properties:

A == ( A^T )^T

if A*B == C, then

B^T * A^T == C^T

Using these properties of transpose, the right division can be defined in terms of the left division operation using the identity:

N rightdiv D = ( N^T leftdiv D^T )^T

For my implementation of the matrix class, the rightdiv member function is implemented with the leftdiv and transpose member functions. I've also overloaded the division operator to use the rightdiv member function.

The matrix class has a few member helper functions (not shown) which are involved in matrix division. The first member function, minor( unsigned i, unsigned j ), calculates the minor matrix of element(i,j) of the matrix. The minor matrix is the matrix with row i and column j deleted.

The matrix class has a member function called det (not shown) which calculates the determinant of a matrix. Determinants are used for applying Kramer's rule, which is used in the matrix division operator. Determinants can only be calculated for square matrices, so the determinant function will throw a std::domain_error exception if you try it on non-square matrices. The definition of the determinant of a matrix is recursive, and is as follows:

If the matrix is of order 1 x 1, the determinant is the value of the only element of the matrix.
To calculate the determinant of a matrix A of order N x N, take any row i of the matrix. The determinant is the result of the summation:

You can also calculate the determinant across a column of the matrix. In the matrix class implementation, I simply just use the top row for calculating the determinant.

For efficiency's sake, I have hard-coded the calculation for the determinant for order 2 x 2 and order 3 x 3 matrices, to avoid the overhead of constructing minor matrices. I also use a member function called minor_det, which returns the same result as Matrix.minor(i,j).det(). The advantage of using minor_det is that it avoids copying the resulting minor matrix to a temporary object just to call the det member function.

I've also employed the kahn_sum template class to increase the accuracy of summing the terms when calculating the determinant.

Kramer's Rule

The heart of the matrix division function is Kramer's rule, a method for solving a system of linear equations. For instance, if you have a system of linear equations with three unknowns, as in:

a₁ x₁ + b₁ x₂ + c₁ x₃ = d₁ a₂ x₁ + b₂ x₂ + c₂ x₃ = d₂ a₃ x₁ + b₃ x₂ + c₃ x₃ = d₃

you can represent this system with the matrix equation:

Kramer's rule says that to solve for nth x, you divide the determinant of the matrix created when you replace column n of the coefficient matrix with the product matrix by the determinant of the coefficient matrix. If the system of linear equations has no solution (it is under-determined) the determinant of the coefficient matrix will be the additive identity (also known as zero). If this is the case, the leftdiv method will throw an underdetermined exception object.

Kramer's rule is used to solve divisions where the numerator is of order N x 1, and the denominator is of order N x N. If the numerator has more than one column, the division operation is broken down so that it calls the division operator function recursively. For each column in the numerator, it calls the division function using a column of the numerator to solve for a column of the quotient.

If the denominator matrix has fewer rows than columns, then the division is automatically under determined, and there is no solution. The division operator function will throw an underdetermined exception object.

If the denominator matrix has more rows than columns, the division may either be possibly over determined or under determined. To see if one of these error conditions is the case, the division function uses the following algorithm: For each one of the rows, the function creates a new numerator and denominator matrix with that row deleted. Then the function tries that division with the new numerator and denominator (note that this is recursive) in a try block that catches underdetermined exceptions. If each solution is under determined, the division function throws an underdetermined exception object. If there is exactly one solution (the others are under determined), that solution is returned as the quotient. If there is more than one solution, and one or more of the solutions differ from the other solutions, then the quotient is over determined, and the division operator function throws an overdetermined exception object.

For a real-world application that must solve possibly over-determined matrix equations, the generalization provided by the matrix template is not very useful. Usually, you would want to fit the solution using a least squares technique, rather than throwing an over-determined exception. The problem with the generalization is that it does not require the element type to provide a way to evaluate how "close" two different values are to each other.

Other Methods Not Shown

There are a few other methods defined for the matrix class. The identity method, which is valid only for square matrices, returns a multiplicative identity matrix which is of the same order of the given matrix. The multiplicative identity is the only matrix which, when multiplied by a given matrix, yields a product that is the same as the given matrix. The identity matrix is generated by setting all elements of the matrix to the additive identity for the element type except for the elements in the diagonal from (0,0) to (N,N), which are set to the multiplicative identity for the element type instead.

The inverse method, which is also valid only for square matrices, returns a matrix which is the multiplicative inverse of the given matrix. The multiplicative inverse is the matrix that, when multiplied by the given matrix, yields the multiplicative identity of the given matrix. The inverse is calculated by the expression M.identity()/M.

A Demonstration

Listing 2 (matrix.cpp) is a small program which demonstrates use of the matrix template class. I show only one of the tests here (test4), but the complete test program is available electronically (see p. 3).

test1 is a demonstration of solving a system of linear equations with three unknowns using the complex<double> type.

test2 is a demonstration of calculating the multiplicative inverse of a square 7 x 7 matrix filled with random values.

test3 demonstrates solving a possibly over-determined system of linear equations. There are five equations with three unknowns. As initially set up, it has a solution. After solving the system, the test function tweaks the first equation to make the system over determined, and the division is tried again, to demonstrate that the division function throws the overdetermined exception.

test4 demonstrates the versatility of the matrix class. In the header file poly.h (Listing 3) , I've defined a template class used for representing polynomial functions. A polynomial function object represents a function which can be expressed as:

. . . + c_nxⁿ + . . . + c₂x² + c₁x + c₀ + c_-1x^-1 + . . . c_-nx^-n + . . .

where x is an unknown, and C is an array of constant coefficients. Both the unknown and the coefficients are of the same type as the template parameter. The polynomial class has the algebra and appropriate overloaded operators to be used as an element type for the matrix class. test4 sets up a system of linear equations using polynomials and solves it.

test5 demonstrates another use, mixing the matrix and polynomial template classes. First, test5 sets up a 4 x 4 matrix initialized to polynomials with constant values. Then, test5 generates the polynomial function needed to find the eigenvalues of the matrix, by calling (M.identity()*X - M).det(), where X is the polynomial ( x ) used to represent the unknown eigenvalue.

As these examples illustrate, matrices can be used to solve several different programming problems, such as the solving systems of linear equations. By generalizing the matrix class by making it a template, it is possible to use just about any type that acts like a real number. o

Robert Mashlan is an independent software consultant doing business as R2M Software Company, specializing in Windows device drivers. He may be reached at 503-738-0849 or [email protected] on the Internet. His PGP public key fingerprint is 17 F9 F3 35 E8 3C FE 7A B7 CB 19 FB EC 73 77 94. WWW surfers may read his home page at http://r2m.com/~rmashlan.

Previous 1 2 3 4

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

A C++ Matrix Template Class