Channels ▼


Computer Chess: The Drosophila of AI

Computer Chess: The Drosophila of AI

The domain of computer chess playing is suggested as a general means for quantifying the distance by which we have not yet achieved our stated objectives in artificial intelligence. The game of chess traditionally has been considered, at least in Western societies, as the epitome of intellectual skill and accomplishment. Herbert Simon and later John McCarthy, among the cofounders of AI, have referred to chess as the Drosophila of AI, speaking metaphorically about the importance for genetics of Thomas Morgan's early research with fruit flies, for which he won the Nobel Prize in 1933. This metaphor is appropriate, since the quantification of human chess play has been institutionalized over the last 40 years by giving every tournament player a numerical rating, a metric that also can be used to measure progress in machine performance.

A Brief History of Chess

The invention of chess is at least 2,000 years old and has been attributed variously to Arabians, Babylonians, Castilians, Chinese, Irish, Jews, Persians, Romans, Scythians, and Welsh. The Greeks claimed that Aristotle invented chess, but no such source is truly reliable. The board itself is likely to have been invented by the Hindus (4,000 B.C.), and we know that Queen Nefertari, wife of Pharaoh Rameses II, played a board game in ancient Egypt (1,300 B.C.) that looked like chess, but evidence exists that these early versions of chess may have been played with dice, rather than being a game of pure intellect. After many variations in time and place, the modern game, as we know it today, established in the 16th century with the addition of the en passant pawn capture rule.

In 1993, world champion Gary Kasparov unilaterally created his own world chess organization (The Professional Chess Association or PCA) with the aim of displacing The International Chess Federation (FIDE), which traditionally has supervised tournaments for the world title. FIDE is an organization with which Kasparov has substantive disagreements. The PCA now claims that it alone will sponsor all "championship matches," the most recent of which was in London in which Kasparov beat the designated challenger, Nigel Short of England, by the score of 12-1/2 to 7-1/2. Furthermore, the PCA argues that the FIDE-sponsored Karpov-Timman match in Amsterdam and Jakarta had no merit in so far as a world championship match is concerned, despite the fact that Karpov defeated Timman by a score of 12-1/2 to 8-1/2. FIDE now faces its greatest crisis since it was founded in 1924. The London Times provided $2.54 million for the PCA match, while FIDE struggled to muster only $705,000 (one million Swiss francs). A $1.5 million purse allegedly promised by the Sultan of Oman failed to materialize.

However, few professional players were able to earn a living through their love of the game, and most had other occupations. Judith Pulgar of Budapest, Hungary, is the current woman's champion at age 17, one of the youngest members of the set of about 250 grand masters throughout the world.

History of Computer Chess

The first alleged chess automaton was Hungarian Baron Wolfgang von Kemplen's "Turk," a sham machine with false doors and fraudulent clockworks contrived in 1770. It toured to large audiences throughout Europe. Its secret was a human dwarf cleverly concealed inside, who also happened to be an excellent chess player. It played Napoleon at one time, and Edgar Allan Poe revealed its secret in one of his many essays. Unfortunately, it was destroyed in a fire at the Chinese Museum in Philadelphia in 1954.

Spanish inventor Leonardo Torres y Quevedo developed and authentic algorithmic King/Rook vs. King machine in 1890 at the Polytechnic University's School of Road Works in Spain. A later version of this machine was demonstrated at the Paris World's Fair in 1914. Alan Turing, inventor of the Turing Machine, worked on the problem at England's Manchester University from 1946-1952. Claude Shannon, the father of modern information theory who worked on chess at Bell Labs in New Jersey in 1949, was the first to calculate the total number of chess games you would have to examine if you wished to catalog exhaustively all winning games and therefore play perfectly at each move. That number, 10^120, is called the Shannon number in his honor, a number bigger than the number of atoms in the known universe. Ulan and Stein worked at Los Alamos, N.M., developing a program in 1956. Alex Bernstein and colleagues worked on a program in 1958 running on an IBM 704.

At Rand Corp. in 1959, Allen Newell, Herbert Simon, and Cliff Shaw developed the first program with the aim of imitating the playing styles of human grand masters, a more "cognitive psychology" approach to the program of computer program design. Their program, NSS, running on the Johniac, required more than an hour per move. This work continued at Carnegie Institute of Technology (CIT) (Pittsburgh, Pa.) in the late 1960s on a Bendix G-20. A.D. de Groot's book became a bible for programmers at CIT, using this approach. de Groot observed that grand masters had an uncanny skill at reconstruction tasks that diminished in direct proportion to skill level as you went down the chess ladder to masters, experts, local town champions, and then to average "club players." For example, in Holland, Max Euwe, a former world champion, faithfully could reconstruct (error free) any complex midgame position (with perhaps 25 pieces n the board) flashed on a screen for only five to 10 seconds, while others under the same time constraints made increasing mistakes inversely with their chess rating. Curiously, randomizing pieces on squares gave a uniformly high error rate for all players. Still, grand masters were elusive in articulating why they made the "intuitively" good moves they made compared to lesser players. They searched the space of possible moves more deeply, but also more narrowly. Working with Simon, George Baylor developed an endgame player called Mater about this same time. In 1968 , Barbara Huberman developed another chess-endgame player at Stanford under John McCarthy's supervision.

At MIT (Cambridge, Mass.), Richard Greenblatt and Donald Eastlake developed MAC HACK (1966-1970), a program that achieved notoriety by beating Hubert Dreyfus of the University of California at Berkeley, a well-known detractor of the filed. Enough programs were around so that a tradition was begun of having national and later international chess programs competitions. Computer tournaments were refereed in much the same fashion as human tournaments, usually in association with ACM conferences. Early notable computer chess matches were documented in the ACM SIGART newsletter. From 1970 to 1978, a series of programs entitled Chess 3.0-4.7 running on CDC equipment and developed by David Slate, Larry Atkin, and Keith Gorlen of Northwestern University (Chicago) was the major competitor. In 1983, CRAY BLITZ, which now runs on an YMP-8 supercomputer, was strongest.

In 1985, HI-TECH was developed at Carnegie Melon University (CMU) (Pittsburgh) under Hans Berliner, Carl Ebeling, and Murray Campbell. As well as being a programmer, Berliner was a strong player, having won the World Correspondence Chess Championship for several years. Intense diligence is required in postal chess, since a single game may take up to three years to complete and the slightest error will lead to defeat. Berliner also was the first to appreciate the "horizon effect" in a rigorous manner. This is the problem of requiring an arbitrary cutoff point in the program's "next move" generator and evaluating a position prematurely (before all possible exchanges had been achieved). Thus, the real "utility" of that particular line was never determined, while the possibility for disaster always loomed. Also, approximate evaluation functions that may be pathological in that deeper search may lead to worse decisions. This apparent paradox is produced by the accumulation of errors during evaluation.

At present, the strongest program contender has evolved from the CMU team, dubbed "Deep Thought" (following the Watergate incident), and has now migrated to IBM Watson Research Labs (Hawthorne, N.Y.) with Murray Campbell, Feng-Hsiung Hsu, Thomas Anatheraman, and Andreas Nowatzyk. Deep Thought II runs with 24 custom chess hardware processors attached to an IBM RS6000/550 front-end workstation. Last year, Deep Thought II coasted through all rounds of the latest (five-round Swiss-style) Computer Chess Championship Tournament in Albuquerque, N.M., with a perfect score of 5-0, winning a $4,000 prize. The program has a rating of 2,550-plus. It examines 10-ply at each move (and even more during the end game) and averages three minutes per move, typically examining 900 billion positions.

Microcomputer Chess History

While all this was going on, about 15 years ago some entrepreneurs with no pretensions to build a world-class chess machine decided that a market existed for chess software on PCs targeting ordinary players. This gave rise to today's Chess Challenger, Sargon V, Elite Premier, Knightstalker, Socrates, Mephisto RISC, Grand Master Chess, Chess Master 3000, and others that offer their software for prices that typically range from $20-40 at local home-PC hobby outlets. What is astonishing is that these programs became very good, approaching master-level play, when set to their highest level of competence. They also gave inspiration to fabulous three-dimensional computer graphics and sound effects with animated knights charging across he board and clanking their swords, visible at computer shows. Battle Chess 4000 is now out on CD-ROM with high-resolution graphics. Perhaps the last work in hobbyist software is National Lampoon's Chess Manic Five Billion. This latter package has an unusual option in its menus that lets you choose whether the program should be allowed to cheat! For example, in such a case the program might try to remove pieces from the board illegally when you're not looking.


In a now famous prediction, Herbert Simon forecast in 1957 that "within 10 years, a computer would routinely beat the world's best player." In 1967, that predication did not come true. It didn't come true in 1977, nor even in 1987. Philosophers argued that this was evidence that computers cannot beat the best humans, in principle, because of their digital nature and inherent limitations discovered in mathematical logic. However, we now have reason to believe that this prediction may come true before 1997. Before we uncork our bottles of champagne, what is the current evidence for this prediction?

The lower curve in Figure 1 shows the systematic evolution of computer chess programs during the period 1950 to the present year. The upper curve in Figure 1, a horizontal line, shows the slight rise in human chess performance ("inflation") over a similar period of time. The slope of the lower live is observed empirically to be 40 points per year (m2 = 800 points in 20 years), while the slope of the upper line is about two points per year (m1 = 40 points in 20 years). Solving the pair of linear regression equations:

y1 = m1x+b1
y2 = m2x+b2

you can compute the coordinates of the intercept with a time of 1997 and a rating of about 2,820 plus or minus some 10 points of tolerance.

Using a ruler to connect the dots is relatively easy. Extrapolating these points into the future until they intersect is tempting. Although such an extrapolation is easy, at least on paper, the questionable validity of such a procedure depends on whether the assumption of "linearity" will continue to hold in the time remaining. For the human performance curve, it's a good bet. No one expects human chess capacity to change significantly in the next few years. But for computers, some nonlinear process (beyond our current planning horizon) could enter at any time; computer progress has no logical inevitability along this dimension. Indeed, it has been estimated that an additional ply of search could improve performance as much as 250 rating points, but beyond the master level (2,200 points), each ply appears to be worth about an additional 100 points. Thus, a straight-line advance could turn into a sigmoid curve (saturation) and then plateau, or approach human performance asymptotically because some critical resource became scarce or we approached some theoretical limit on computer power like the "speed of light," On the other hand, we have no evidence that any law of physics will block further progress using current technology, at least not for the next few years. To the contrary, applying the principles of brute force and parallel hardware architectures has been suggested, so as to increase the chess-planning horizon to cover more, and more nodes per unit time will result in stronger computer play.

Another notable 10-year prediction was made in Edinburgh in 1968 by John McCarthy in his challenge to David Levy, then the reigning Scottish chess champion: "In 10 years, a program will certainly be able to beat you." Levy essentially replied, "Put your money where your mouth is." So 500 pound bet (then worth $1,250) was made. Donald Michie took on part of the bet. Later, Seymour Papert of MIT joined in. Finally, Ed Kozdrowicki of the University of California at Davis joined in, as a fourth member of the consortium. Ultimately, the consortium lost. Note that Levy had a rating of 2,440 at that time, and his competition, CHESS 4.7, the best program of the day, was nowhere near that level. However, after another 10 years, programs had crossed the 2,440 rating by a considerable margin of 100 points, and Levy, in fact, systematically had lost to Deep Thought II in London in 1989. You would have to conclude that the consortium's heart was in the right place, but its timing was off. Levy and Omni magazine had offered a prize of $5,000 for the first program to beat Levy with no particular time limit attached.

As an added incentive, Ed Fredkin offered $100,000 as a prize to the designers of the first computer program to beat the current human world chess champion under stringent tournament conditions. Deep Thought, which barely lost to Anatoly Karpov on one occasion in 1990, attempted to beat Gary Kasparov twice in October 1989, but Kasparov studied all of Deep Thought's published games and gained some insight into its strengths and weaknesses, but he did not attempt to exploit idiosyncratic weaknesses in the program as might have been revealed to him by insiders from the programming team. In other words, he played it as though he were playing a human.

However, the quality of human and computer play is relative to the horizon effect, and at least for computers this is related exponentially to the number of nodes (positions) that can be examined per second. Most of today's best programs examine a range of 5,000-10,000 nodes per second. But, for the most part, they are standard serial machines, limited by the speed of the processor. Parallel architecture machines, however, are scalable and open us up to new levels of performance, not limited by the state-of-the-art of individual silicon or gallium-arsenide chip line-widths measured in microns. Using a parallel custom hardware approach, Deep Thought I examined 750,000 nodes per second in its day. Deep Though II can examine five million nodes per second.

Murray Campbell recently reported that the VLSI chess chip under development by IBM for some time is now completed, and a 10-processor prototype has been tested internally in August 1993. This prototype should be capable of 30-50 million nodes per second. Judith Pulgar, the 17-year-old woman's chess champion from Hungary, who beat Boris Spassky in Budapest in February 1993, played this prototype recently and lost (under the speeded-up nontournamnet conditions of 30 minutes per side). After the match Polgar said "Chess is about 30% psychology, and you don't have this with the machine. You can't confuse it....Soon it will be just a matter of time before humans are doomed."

Deep Blue, a planned 1,000-processor machine, should be completed by the end of 1993 and will have a capability of one billion nodes per second. It is speculated that, in terms of chess rating, this machine will have a rating of 3,400, approximately 500 points higher than the best humans. Of course, the scalable parallel architecture utilized would not preclude building an even larger 10,000-node machine, if desired, since the nonrecurring engineering costs of chip fabrication, once the design has been paid for, are comparatively small.

The Deep Blue team expects to arrange a match with Kasparov in early- to mid-1994 after testing has been completed. As a footnote, Campbell reported that the cost of the Deep Blue development to IBM likely will be approximately one million dollars, and therefore the Fredkin prize will turn out not to have been that much of an incentive after all, even assuming that they could satisfy all the prize's boundary conditions.

Therefore, in light of these conditions, the real time for the Simon prediction to come true may be sometime in 1994, as a result of a different sort of nonlinear intervention that will work in our favor rather than against us--namely, parallel architectures. When I asked Simon about his prediction back in the 1960s, long before parallel machines were being considered, he seemed embarrassed about having his name associated with the prediction, but stuck to his guns on the grounds that the "10-year" prediction was predicated on the assumption that more people in the AI community would be attracted to the problem and would have worked on it more. Since this scenario didn't happen, his prediction was, in his opinion, still valid (in terms of 10 years' worth of intellectual effort). Perhaps, when all is said is done, he will not be happy with the method by which his dream was accomplished--machinomorphic brute force rather than as a stepping stone to a universal set of principles about human thought processes (a theory of bounded rationality) that would help us scale up to a broader class of intellectually interesting grand challenges.

The Future

Assuming that no disasters overtake us and IBM keeps building computers, there is no reason to believe that the positive linear-exponential model will not remain essentially valid for the foreseeable future. Therefore, there is no reason to believe that machines will barely exceed human skill levels and then plateau in their competence. In fact, we should expect that machines will perform this narrow task so well by the end of this century that their skill will be orders of magnitude superior to the best human play. Probably, only a small number of machines will play each other in staged contests, and the general population will be bored by the general population will be bored by their extraordinary efforts, the "What have you done for me lately?" syndrome. The cost of access to such machines over the Internet or some other national information superhighway soon to be in place will fall, so that, in principle, any player who wishes to should be able to play against such machines.

Of course, humans will always play each other into the indefinite future. More books about chess have been published than about any other game. Recreational chess by humans against PCs also will continue to take place for human-learning purposes. Computers will play each other for the sake of world records. However, no one will doubt that Simon's prediction wasn't inevitable after it happens.

What will be the significance of this accomplishment for other areas of AI research? Sadly, not much. The brute force nature of the final solution might tell us how to go about building a parallel architecture "Go" or "Checker" machine, if we are so inclined, but it does not scale up to problems in robotics or expert systems, for example. What it does tell us is that, just as in the case of Drosophila, even when you fully understand the genetics of fruit flies (which we don't), you still have a long way to go before you understand the genetics of elephants or other interesting mammals. For reference, it takes 150 pages of text just to describe the DNA sequence of a simple bacterial lambda phage virus. More complex viruses still are not yet sequenced. Less than 1% of the human sequence is now known, despite a worldwide effort in the scientific community to do so.

Of course, we will have exonerated ourselves in front of some of our habitual critics, and, at least temporarily, it should make us feel good. However, we may acquire some new detractors in the process. Recall that not too long after the heady accomplishments of the Apollo lunar landings, some were complaining, "If we can land a man on the moon, then why can't we do x?" where x stands for things such as feed the hungry, house the homeless, heal the sick, and so on. Only in retrospect can we say that these endemic social problems are truly more difficult than a crisply defined systems problem such as going to the moon. Systems problems are amenable to a solution using a purely technological approach by a dedicated NASA-type institution created for the purpose of solving it, not forgetting the additional incentives of competition, national pride, and the boundary conditions of nearly unlimited access to brain power and financial resources.

The Next Challenge

By developing Deep Blue, we will have sharpened our tools to attack the next AI ground challenges. The virtue of the chess challenge was that it had a ready-made quantitative linear scale associated with it that helped us keep our eye on the target and measure the distance by which we had not yet accomplished our goal. The next grand challenges of speech an natural-language processing, robotics, and common-sense reasoning lack this sort of crisp linear scale. How to define such a metric becomes part of the problem.

Suggested Reading
de Groot, A.D. Thought and Choice in Chess. The Hague: Mounton, 1965.
Huberman, B.J. "A Program to Play Chess Endgames," in Stanford University Technical Report CS 106, 1968, Ph.D. thesis.
Kpec, D., M. Newborn, and M. Valvo. "The 22nd Annual ACM International Computer Chess Championship in Albuquerque, New Mexico," in Communications of the ACM 35(11): 100-110.
Levy, D., and M.M. Newborn. More Chess and Computers: The Microcomputer Revolution and The Challenge Match. Potomac, Md.: Computer Science Press Inc., 1980.
McCarthy, J. "Chess as the Drosophila of AI," in Computers, Chess, and Cognition. T.A. Marsland and J. Schaeffer, (eds.). New York, N.Y. : Springer-Verlag, 1990.
Schaeffer, J., N. Treloar, P. Lu, and R. Lake. "Man vs. Machine for the World Checkers Championship," in AI Magazine 14(2): 28-35.
Schonberg, H.C. Grandmasters of Chess. New York, N.Y.: Lippincott, 1973.
Shannon, C.E. "A Chess-Playing Machine," in The World of Mathematics Vol 4, J.R. Newman (ed.). New York, N.Y.: Simon and Schuster, 1956.
Simon, H.A. Models of My Life: A Remarkable Autobiography of the Nobel Prize Winning Social Scientist and Father of Artificial Intelligence. New York, N.Y.: Basic Books, 1991.
Simon, H.A. and J. Schaeffer. "The Game of Chess," in The Handbook of Game Theory with Economic Applications Vol. 1, R.J. Aumann and S. Hart (eds.). Amsterdam: North Holland Press, Elsevier Science Publishers, 1992, pp. 1-17.
Soltis, A. Chess to Enjoy. Lanham, Md.: Madison Books UPA, 1980.
Waitzkin, F. Mortal Games: The Turbulent Genius of Gary Kasparov. New York, N.Y.: G.P. Putnum's Sons, 1993.

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.