Dr. Dobb's | Feedback Strategies

Feedback Strategies

November 2005 Dr. Ecco's Omniheurist Corner.

November 01, 2005
URL:http://www.drdobbs.com/parallel/feedback-strategies/184406332

November, 2005: Feedback Strategies

Dennis, a professor of computer science at New York University, is the author of four puzzle books: The Puzzling Adventures of Dr. Ecco (Dover, 1998); Codes, Puzzles, and Conspiracy (Freeman 1992, reprinted by Dover in 2004 as Dr. Ecco: Mathematical Detective); and recently Dr. Ecco's Cyberpuzzles (W.W. Norton, 2002); and Puzzling Adventures (W.W. Norton, 2005). With Philippe Bonnet, he has written Database Tuning: Principles, Experiments, and Troubleshooting Techniques (2002, Morgan Kaufmann). With Cathy Lazere, he wrote Out of Their Minds: The Lives and Discoveries of 15 Great Computer Scientists (1995, Copernicus/Springer). He can be contacted at [email protected].

Dr. Ecco Solution

"I promise that you will enjoy this one," Baskerhound said, coming in for a surprise visit. "It will resonate with your theories of human nature. Let me start with the populist pitch.

"Have you ever admired your own skill at navigating a narrow road at high speed? If not, imagine the following alternative method of travel: Pore over a detailed map of the same road, figure out how much the wheel should turn and the accelerator should be pressed at every time point, and then drive down the road blindfolded. Even without obstacles, this is beyond the memory and trigonometric capacity of most of us."

"I will grant you that," Ecco replied with a sly smile.

"Instead, we're hardly conscious of the intellectual effort of driving," Baskerhound continued. "Perhaps the reason is that the act of driving consists of very short-term plans (a few seconds at most) followed by adaptation based on eyesight. The driver has an overall goal—get to the end of the road—but the plan is incremental and adaptive. This requires less brainpower and is far more robust to changes in the environment.

"Any person on the street understands this argument, but my bosses require quantification. So to make this concrete, I have proposed the following game. Consider this standard checkerboard that has 8 rows and 8 columns (see Figure 1).

"You want to go from row 1, column 4 (the black square above the S) to row 8 column 5 (the black square below the E). Each move goes from black square to black square and proceeds up a row and either to the left or right diagonally adjacent square. If you fall off the checkerboard or reach the top row without reaching the correct square, you lose.

"At each move, you get to aim to go either right or left. You will achieve that step's aim with probability Pgood, whose values we will discuss in a minute. There are two kinds of strategies: FeedYes and FeedNo.

"A FeedYes strategy can decide where to aim on the ith move after seeing the results of the first i-1 moves. A FeedNo strategy must decide where to aim at step i from the very beginning.

"Here is an example to show you the difference. Suppose that you want to go from row 1, column 4 to row 3, column 4. Suppose that Pgood is 0.9. Then in the FeedYes strategy, you might aim right the first move. If you in fact go right (probability 0.9), then you would aim left the second move. But if you go left on the first move (probability 0.1), you will aim right the second move. The net result is that you have a probability of 0.9 to hit your destination. In the FeedNo strategy, you might do something like aim right the first move and aim left the second. There are two cases in which you would win with that strategy: You in fact move right in move 1 and left in move 2 (probability 0.9×0.9=0.81) or you move left in move 1 and right in move 2 (probability 0.1×0.1= 0.01). So FeedNo has a probability of 0.82 of hitting the destination.

"Call the feedback dividend the probability of hitting the destination with the optimal FeedYes strategy divided by the probability of hitting it with the optimal FeedNo strategy. (Optimal means that you do as well as you can based on the probabilities.) In the example here, the feedback dividend is 0.9/0.81.

"Here's a warm-up: Are there any values of Pgood for which the feedback dividend is 1 regardless of source and destination?"

"My dear Benjamin, of course," said Dr. Ecco. "If Pgood were 0.5 or 1, the feedback dividend would be only 1. In the first case, it doesn't matter where you aim. In the second, you don't need feedback. For all other Pgood values, the dividend will exceed 1."

"I didn't think that would be hard," said Baskerhound. "Now here is the full problem. You start at row 1, column 4 and you want to hit row 8, column 5.

"1. If Pgood is 0.9, what is the probability of hitting in the FeedYes strategy and in the FeedNo strategy?

"2. For which value of Pgood does the feedback dividend reach its greatest value? What is the feedback dividend in that case?"

I was quite surprised by the result of this second question. Not intuitive at all.

After Baskerhound left, Ecco asked me one other: "3. If we cut off the three rightmost columns and the two left-most columns, then which value of Pgood would give the highest feedback dividend? Assume that falling off the board gives a certain loss."

DDJ

November, 2005: Feedback Strategies

Figure 1: The checkerboard.

November, 2005: Feedback Strategies

Dr. Ecco Solution

Solution to "Calculation in the Narrows," DDJ, October 2005.

The configuration 5220 5220 5220 5221 5220 5220 5220 5219 5220 5220 5220 5220 5220 never changes because the 5221 never encounters 5219 and all rounding is to the nearest whole number where the initially higher number rounds up. If the number acquiring the higher number were chosen at random, then this would converge.
The following configuration requires 38 rounds when using one decimal point: 1287 6733 7176 4061 9928 797 8570 82 9642 287 7823 823.
The following initial configuration requires 14 rounds 0 0 10000.
Every configuration completes in finite time, because eventually (with probability 1), if two numbers differ by two or more, two people having those numbers will eventually meet.

DDJ