Computer scientists at Carnegie Mellon University have devised an innovative and elegantly concise algorithm that can efficiently solve systems of linear equations. Such systems are critical to important computer applications such as image processing, logistics and scheduling problems, and recommendation systems. They are also widely used to model real-world systems in areas such as transportation, energy, telecommunications and manufacturing, and these models often include millions, if not billions, of equations and variables.
Solving these linear systems can be time consuming on even the fastest computers and is an enduring computational problem that mathematicians have sweated over for 2,000 years. The new algorithm, devised by Gary Miller, Ioannis Koutis and Richard Peng, and described in "Approaching Optimality for Solving SDD Linear Systems," employs powerful new tools from graph theory, randomized algorithms and linear algebra that make stunning increases in speed possible.
The algorithm, which applies to an important class of problems known as symmetric diagonally dominant (SDD) systems, is so efficient that it may soon be possible for a desktop workstation to solve systems with a billion variables in just a few seconds.
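For readers unfamiliar with the term, a matrix is symmetric diagonally dominant when it equals its own transpose and each diagonal entry is at least as large as the sum of the absolute values of the other entries in its row. The short Python sketch below checks that standard definition. The function name `is_sdd` and the example matrices are illustrative assumptions and do not come from the team's paper.

```python
def is_sdd(matrix, tol=1e-12):
    """Return True if `matrix` (a list of rows) is symmetric and diagonally dominant.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    n = len(matrix)
    # The matrix must be square.
    if any(len(row) != n for row in matrix):
        return False
    # Symmetry: entry (i, j) must match entry (j, i).
    for i in range(n):
        for j in range(i + 1, n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                return False
    # Diagonal dominance: each diagonal entry is at least the sum of the
    # absolute values of the off-diagonal entries in its row.
    for i in range(n):
        off_diagonal = sum(abs(matrix[i][j]) for j in range(n) if j != i)
        if matrix[i][i] + tol < off_diagonal:
            return False
    return True


# Laplacian of a three-node path graph: a classic SDD matrix.
path_laplacian = [
    [1, -1, 0],
    [-1, 2, -1],
    [0, -1, 1],
]

# Symmetric, but the off-diagonal entries outweigh the diagonal.
not_dominant = [
    [1, 2],
    [2, 1],
]

print(is_sdd(path_laplacian))
print(is_sdd(not_dominant))
```

Graph Laplacians such as `path_laplacian` are a common source of SDD systems, which is one reason graph theory is a natural tool for solving them.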
A myriad of new applications have emerged in recent years for SDD systems. Recommendation systems, such as the one used by Netflix to suggest movies to customers, use SDD systems to compare the preferences of an individual to those of millions of other customers. In image processing, SDD systems are used to segment images into component pieces, such as earth, sky and objects like buildings, trees and people. "Denoising" images to bring out lettering and other details that otherwise might appear as a blur also makes use of SDD systems.
A large class of logistics, scheduling and optimization problems can be formulated as maximum-flow problems, or "max flow," which calculate the maximum amount of materials, data packets or vehicles that can move through a network, be it a supply chain, a telecommunications network or a highway system. The current theoretically best max flow algorithm uses, at its core, an SDD solver.
"In our work at Microsoft on digital imaging, we use a variety of fast techniques for solving problems such as denoising, image blending and segmentation," said Richard Szeliski, leader of the Interactive Visual Media Group at Microsoft Research. "The fast SDD solvers developed by Koutis, Miller and Peng represent a real breakthrough in this domain, and I expect them to have a major impact on the work that we do."
A number of SDD solvers have been developed, but they tend not to work across the broad class of SDD problems and are prone to failures. The randomized algorithm developed by Miller, Koutis and Peng, however, applies across the spectrum of SDD systems. The team's approach is to first construct a simplified system that can be solved rapidly and then use it as a "preconditioner" to guide iterative steps toward the final solution. To construct the preconditioner, the team uses new ideas from spectral graph theory, such as spanning trees and random sampling.
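The general idea of preconditioned iteration can be sketched in a few lines of Python. The example below is not the team's algorithm: it uses the textbook preconditioned conjugate gradient method with a simple diagonal (Jacobi) preconditioner as a stand-in for the tree-based preconditioner described above. All function names and the small test system are assumptions made for illustration.

```python
def matvec(a, x):
    """Multiply matrix `a` (a list of rows) by vector `x`."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def preconditioned_cg(a, b, apply_preconditioner, tol=1e-10, max_iter=1000):
    """Solve a x = b for a symmetric positive definite matrix `a`.

    `apply_preconditioner(r)` returns an approximate solution of the
    simplified system M z = r, where M is a cheap stand-in for `a`.
    Returns the solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - a x, with x = 0
    z = apply_preconditioner(r)      # solve the simplified system
    p = list(z)                      # first search direction
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:  # stop once the residual is small
            return x, iteration
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = apply_preconditioner(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# A small, strictly diagonally dominant symmetric system (illustrative only).
a = [
    [4.0, -1.0, 0.0],
    [-1.0, 4.0, -1.0],
    [0.0, -1.0, 3.0],
]
b = [1.0, 2.0, 3.0]


def jacobi(r):
    """Diagonal preconditioner: divide each residual entry by a[i][i]."""
    return [ri / a[i][i] for i, ri in enumerate(r)]


solution, iterations = preconditioned_cg(a, b, jacobi)
residual = max(abs(bi - axi) for bi, axi in zip(b, matvec(a, solution)))
print(solution, iterations, residual)
```

The better the simplified system approximates the original one, the fewer iterations the outer loop needs. The trade-off is that the preconditioner must itself be cheap to build and apply.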
The result is a significant decrease in computer run times. The Gaussian elimination algorithm runs in time proportional to s^3, where s is the size of the SDD system as measured by the number of terms in the system, even when s is not much bigger than the number of variables. The new algorithm, by comparison, has a run time proportional to s[log(s)]^2. That means, if s = 1 million, that the new algorithm would run about a billion times faster than Gaussian elimination.
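The "billion times faster" figure can be checked with a quick back-of-the-envelope calculation. The Python sketch below compares the two growth rates for s = 1 million. It ignores constant factors, and it assumes a base-2 logarithm because the article does not state the base. The function names are illustrative.

```python
import math


def gaussian_elimination_cost(s):
    """Growth rate quoted for Gaussian elimination: s^3."""
    return s ** 3


def new_solver_cost(s, log_base=2):
    """Growth rate quoted for the new algorithm: s * log(s)^2.

    The base of the logarithm is an assumption; the article does not give it.
    """
    return s * math.log(s, log_base) ** 2


s = 1_000_000
speedup = gaussian_elimination_cost(s) / new_solver_cost(s)
print(f"Estimated speedup for s = {s:,}: {speedup:.2e}")
```

With these assumptions the ratio comes out at a few billion, which is consistent with the order of magnitude quoted above. Real run times would also depend on the constants hidden in each estimate.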
Other algorithms are better than Gaussian elimination, such as one developed in 2006 by Daniel Spielman of Yale University and Miller’s former student, Shang-Hua Teng of the University of Southern California, which runs in time proportional to s[log(s)]^25. But none promises the same speed as the one developed by the Carnegie Mellon team.
"The new linear system solver of Koutis, Miller and Peng is wonderful both for its speed and its simplicity," said Spielman, a professor of applied mathematics and computer science at Yale. "There is no other algorithm that runs at even close to this speed. In fact, it's impossible to design an algorithm that will be too much faster."