Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Directed Evolution


Nov98: Dr. Ecco's Omniheurist Corner

Dennis, a professor of computer science at New York University, is the author of The Puzzling Adventures of Dr. Ecco (Dover, 1998), and Codes, Puzzles, and Conspiracy (W.H. Freeman & Co., 1992), Database Tuning: A Principled Approach (Prentice Hall, 1992) and (coauthored with Cathy Lazere) Out of Their Minds: The Lives and Discoveries of 15 Great Computer Scientists (Springer Verlag, 1998). He can be contacted at [email protected].


Dr. Roger Windswift of the Center for Disease Control had chosen his career of ethnobiologist well. Kayaker, underwater photographer, naturalist, entirely comfortable in the jungle or makeshift lab, he looked cooped in by Ecco's MacDougal Street apartment. After shaking our hands, he paced the room as he described his problem, touching objects to sense their texture without interrupting his train of thought. His story unfolded in academic innocence.

"In the mid 1980s, Jeffrey Palmer and L.A. Herbon showed that cabbages and turnips, while close in sequence, differ from each other mostly through a series of reversals," he said.

I saw Ecco's face settle into an expression of bemused indifference.

"A reversal of a string between positions x and y," Windswift continued, "is a permutation of the string in which the subsequence between x and y is reversed but nothing else is changed.

So, starting with a string whose indexes are numbered from 0 to n:

0 1 2 3...x x+1...y-1 y y+1...n,

he obtains a new string with a reversal in the middle:

0 1 2 3...y y-1...x+1 x y+1...n.

We denote such permutations by the low and high indexes of the reversed subsequence, in the above case (x,y). For example, given the string abcdefg the reversal (2,4) yields abedcfg (leave ab alone, reverse cde, and leave fg alone).

"A strain of bacteria closely related to anthrax has recently appeared. We suspect that it was manufactured."

He stopped his pace for a moment and swept his eyes to each of us.

Windswift went on. "Fortunately, this particular strain does not appear to be contagious, but we want to be ready for the next one. Our strategy will be to create antidotes by doing reversals of some secret bacteria that we have. This will serve as a kind of vaccine.

"We have recently developed a technique that will transform a bacteria population A into a new one B that is identical to A except for one reversal that we can control. The technique takes a day to complete. If we want to do five reversals, we will need five days."

"Is there no way to work on many reversals at the same time?" Liane asked.

Windswift looked at the 10-year-old girl for a moment, then said, "Good question. As far as we can tell, the answer is no. That's why we're here. We need to create these antidotes as fast as possible. That means we have to figure out the fewest number of reversals to transform some critical protein sequence to another one, all in vivo."

"Could you give us an example of a reversal problem?" Liane asked.

"Sure, I'll give one that I had trouble doing efficiently. Find the smallest number of reversals that take abcdefg to cfegbad. I challenge you to do it in three reversals," Windswift answered.

Liane thought for a minute and then wrote down:

(0,2), producing "cbadefg"

(1,5), producing "cfedabg"

(3,6), producing "cfegbad"

"Nice job, Liane," Ecco said chuckling. "Well, Dr. Windswift, you can see we have the talent to do rapid, directed evolution. Please give us a real problem now."

"Very well," Windswift said. "But please understand that the sequence I'm concerned with is the start of a critical binding sequence for memory. Please be careful not to reveal this information."

"Yeah, yeah," Ecco said with a smile.

"Here is the start sequence:

mhtvkllcvvfsclcavawasshrqpchsp

"We want to produce the following sequence through as few reversals as possible:

awavchtfrhsscpchvmsvkllvqlcasp

"Can you do it in under 14 reversals? Two weeks is about all the time we have, but we can do 14 reversals in two weeks."

Ecco and Liane worked on the problem for a while. Liane came up with a solution first.

"I can do a few better than you asked for, Dr. Windswift. I've written it down in steps so you can follow it easily. Along with each step, I write the low and high indexes of the subsequence being reversed."

Reader: can you do better than 14? How small a number of reversals can you find?

"Wow," Windswift said. "Do you have any time after school to do some consulting?"

"My uncle keeps me busy enough, thanks," Liane answered sweetly.

Windswift left, but returned a month later. "The anthrax manufacturers have found a new mutation path. Not only can they cause reversals in subsequences, but they can cause subsequences to rotate by one to the right. For example, given the sequence:

abcdefg

the rotation (1,4) would yield

afbcdeg

(the f moves to position 1 and bcde are moved one to the right; remember that the sequence starts at 0).

Fortunately, we've found a process to do a rotation in a day, but now our mathematical problem is to figure out the shortest possible combination of rotations and reversals to transform one sequence to another. This seems hard to us."

"Could be. Please let's have a try," Ecco said.

"Here are the sequences," Windswift said. Start again with:

mhtvkllcvvfsclcavawasshrqpchsp

but end this time with:

mhtcskaavvwalcsfvqpchspshrvcll

Reader: Try to find a short sequence of rotations and reversals to get the ending string from the starting string. Remember that the rotations take a subsequence and rotate it by one position to the right.

Ecco worked on the problem for several minutes in silence. Then he raised his head and asked: "Is there any reason to believe that it is easier to rotate just after reversing?"

"Not that I know of," Windswift said. "Why do you ask?"

"No reason, I just suggest you discuss this possibility with your chemists," Ecco said, as he handed Windswift an eight-step transformation.

Two months later, a small item in a newspaper reported a disease outbreak following a pistachio party in Los Angeles. An unidentified scientist from the Center for Disease Control was on the scene. A few days later, health officials issued an advisory that certain kinds of pistachios should be avoided and a second advisory that people with rashes behind their ears should call CDC in Atlanta. Then the news media returned to its twin obsessions with the stock market and sex scandals.

Last Month's Solution

Here is what Ecco and Liane came up with after 15 exchanges:

0 0 0 0 0 0 0 0 0 0

2 2 2 2 0 0 0 0 0 0

2 2 1 2 2 0 0 0 1 0

2 2 1 1 2 2 0 1 1 1

1 1 1 2 2 2 0 1 1 1

1 2 1 2 2 0 0 1 1 1

1 2 1 2 0 0 1 1 1 1

1 2 1 2 0 0 1 1 1 1

1 2 2 2 2 2 2 2 2 1

1 1 1 1 1 1 1 1 1 1

Here were the exchanges:

1. exchange (0 8) for (2 6)

2. exchange (0 8) for (5 9)

3. exchange (7 4) for (6 2)

4. exchange (3 8) for (8 3)

5. exchange (8 5) for (6 0)

6. exchange (0 6) for (6 1)

7. exchange (9 0) for (0 3)

8. exchange (9 1) for (3 4)

9. exchange (2 6) for (8 1)

10. exchange (4 7) for (4 5)

11. exchange (9 8) for (1 4)

12. exchange (1 9) for (3 9)

13. exchange (7 8) for (0 5)

14. exchange (9 5) for (2 5)

15. exchange (2 9) for (8 6)

Reader Notes

Readers verified that the best solution with the original scoring to the Mates problem (July 1998) was:

tent 1: alan,mike

tent 2: dave,isaac,larry

tent 3: emily,gwenyth,hillary

tent 4: bob,carol,jack,nick

tent 5: felicia,kris,olivia,petra

yielding a total score of 175.

For the case when the campers' preferences dropped by four points (among those with expressed preferences), the best assignment was:

tent 1: alan,issac

tent 2: carol,jack,nick

tent 3: kris,larry,mike

tent 4: bob,felicia,olivia,petra

tent 5: dave,emily,gwenyth,hillary

yielding a total score of 92 (better than the one Dr. Ecco revealed).

Paolo Friedli wondered what Dr. Ecco would have done if there had been 60 kids and 20 tents. I suggested simulated annealing, but there are some excellent genetic-algorithm readers out there.

For example, Mike Williams suggested the following genetic style technique:

arbitrarily assign people to tents

calculate and store happiness

repeat

randomly move some people around

calculate potential happiness

if better than previous best

store best and update current configuration

until tired of trying

Readers who solved this problem include:

Gary Knowles

Michael S. VanVertloo

Andrew Lipson (who considered the problem a bit easy)

Terrence E. Vaughn

Roger Alley

Jean-Francois Halleux

Greg Walker

Jack Boudreau

Mathew Davies (who thought that Dr. Ecco might be suggesting film titles)

Carl Smotricz

Earl Paddon

Onno Waalewijn

Yannick Heneault

Scott McKinney

Iulian Kakulidis

Michael Williams

Manuel Alvarez Fernandez

Tamas Visegrady

Paolo Friedli

Franco Venturi (who also showed that the maximum two-day score if one can't change bunkmates is 262)

Dan Falk

David Seaman

Lee Thomason

Steve Worley

DDJ


Copyright © 1998, Dr. Dobb's Journal

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.