GENETIC ALGORITHMS
Nature's way to search for the best
Richard Spillman
Richard is a professor of computer science and can be contacted at the Department of Computer Science and Engineering, Pacific Lutheran University, Tacoma, WA 98467, 206-535-7406, or BITNET: SPILLMAN_R@PLU.
You've left your keys somewhere in the house, but can't remember where. To find them, you could systematically search every room starting at the top floor and working your way down. Or, you could randomly pick a room, search it, and, if you don't find the keys, randomly select another. Either approach would work fine for a small house, but what if you lived in a 110-room mansion? Both approaches would result in a lot of wasted time (unless you're lucky, but then if you were that lucky, you wouldn't have lost your keys in the first place).
Another approach is to sit down and assign a value to each room in the house, giving each room five points if you have been in the room since the last time you saw your keys. Then begin to randomly search those rooms with high-point totals. If this is your approach, then you know in a small but important way how a genetic algorithm works.
How a Genetic Algorithm Works
Genetic algorithms (GAs), a way to randomly search for the best answers to tough problems, were first suggested by John Holland in his book Adaptation in Natural and Artificial Systems (University of Michigan Press, 1975). Over the last 20 years, they've been used to solve a wide range of search, optimization, and machine-learning problems. As their name indicates, genetic algorithms attempt to solve problems in a fashion similar to the way in which human genetic processes seem to operate (at least at a simple level). A good survey of the nature and use of genetic algorithms can be found in David Goldberg's Genetic Algorithms in Search, Optimization, and Machine Learning (Addison-Wesley, 1989) and G.E. Liepins and M.R. Hilliard's paper "Genetic Algorithms: Foundations and Applications" in Annals of Operations Research (1989). You might also refer to Mike Morrow's "Genetic Algorithms" article in DDJ (April, 1991).
The easiest way to consider a GA is in the context of a function-optimization problem. The goal is to find an integer value between two fixed integers that produces the largest result when substituted into a given function. A genetic algorithm begins with a randomly selected population of function inputs represented by strings of bits. The GA uses the current population of strings to create a new population such that the strings in the new population are, on average, "better" than those in the current population. For example, a population of strings may be generated to search for a maximum value for the function x2-x+ 1, where x is an integer between 0 and 31, as shown in Figure 1. The third element in this sample population represents x = 10 and produces the best function output of the current population of three binary strings. The idea is to use the best elements from the current population to help form the new population. If this is done correctly, then the new population will, on average, be "better" than the old population.
Figure 1: String population.
Population Function New value population ------------------------------------ 0001 1 high function 0110 31 value terms 1010 91
Three processes--selection, mating, and mutation--are used to make the transition from one population generation to the next. The basic genetic-algorithm cycle based on these is shown in Figure 2.
The first step is the selection process. This determines which strings in the current generation will be used to create the next generation. This is done by using a biased random-selection methodology. That is, parents are randomly selected from the current population in such a way that the "best" strings in the population have the greatest chance of being selected. In Figure 1, the string 1010 has the greatest chance of being a parent, while 0001 has the least chance. By using the "best" points to determine the next population, the algorithm seems to move in the most promising direction in its overall search.
The second step is the mating process, which determines the actual form of the strings in the next generation. At this point, two of the selected parents are paired. If the length of the each string is r, then a random number between 1 and r is selected, say s. The mating process is one of swapping bits s+1 through r of the first parent with bits s+1 through r of the second parent. In this way, two new strings are created, as in Figure 3.
The final step is mutation. A fixed, small mutation probability is set at the start of the algorithm. Bits in all the new strings are then subject to change based on this mutation probability. In Figure 4, bits 6 and 10 are mutated. (Bit 6 goes from dark to light, and bit 10 goes from light to dark.) The result is a new generation of strings.
These three steps are repeated to create each new generation. It continues in this fashion until some stopping condition is reached (such as a maximum number of generations).
This algorithm has proven to be very effective in solving some tough search problems. While it may seem to be a random search, in fact, the improvement in each generation indicates that the algorithm provides an effective directed search technique.
Genetic Algorithm Code
Listing One (page 90) is a Turbo Prolog version of a genetic algorithm that searches for the maximum integer value of a function. It consists of 17 procedures, two functions, and a main program. Some of the procedures, such as SetDisplay, GENOUT, clockoff, clockon, Screen_Win, FileOut, and DATAIN are designed as part of the user interface and are not necessary for the genetic algorithm; I won't discuss their function or operation. The OptFunc function is the code that specifies the function to be maximized. If you want to run the code as is, it will search for a maximum value for the function -x3+25x2-2x+55. It is a simple matter to rewrite this function so that you may insert any maximization problem into the code.
When you first run this code, you'll be asked to provide all the necessary parameters for a genetic algorithm. The DATAIN procedure asks for the population size, that is, the number of chromosomes in the population. I recommend at least 12; the program is limited to a population size of no more than 50. (You can change the value of the MaxChrom constant at the beginning of the program if you want a larger population, but 50 is usually more than enough). The next parameter is the number of bits in a gene. This value is determined by the largest integer in your search range. For example, if your search spans the integers from 0 to 1024, then you need 10 bits. If the search is from 0 to 2048, then you need 11 bits, and so on. The program allows up to 200 bits, which represents a large integer-search range. (This value may also be changed by changing the MaxN constant.) The third requested parameter is the number of generations. This is the number of new populations you wish to create during your search. Your search will stop at the number of generations you specify. The next parameter requests a probability value. You may try different mutation probabilities to observe their effect on the success of the search. I recommend beginning with small mutation probabilities such as 0.1 or 0.2. The rest of the parameters are self-explanatory--except for the last one. When you are requested to enter the display update, enter a number between 1 and the total number of allowed generations. If you enter 5, your search will pause after every fifth generation to display the current results. This gives you a chance to see how the GA is doing.
The relevant function and procedures for a genetic algorithm are PARENT, MUTATE, and CROSSOVER. Each performs exactly as required by a classical genetic algorithm. The function PARENT selects a random parent from the current population. It is called twice by the main program to find two parents to send to the CROSSOVER procedure. The CROSSOVER procedure selects a random point and swaps the bits between the parents. It then calls the MUTATE procedure which randomly mutates bits in the two offspring. The result is a new generation of points; they are evaluated, and the best one is saved in BEST.
A Sample Run
If you choose to run the program to find the maximum, use these parameters:
Population Size: 12 Number of Bits: 6 Number of Generations: 5 Mutation Probability: 0.1 Random Seed: 1234 Best Guess for the Max: 2337 Minimum Interval Value: 0 Maximum Interval Value: 5 Display Update: 1
Your initial population is going to look like this:
010010, 001100, 001011, 001011, 001000, 001000, 000101, 000100, 100011, 101010, 111000, 100100
The actual integer max occurs at 17. The random generation of 12 integers in the range 0 to 50 did not do too badly since one of the random points is 18, which is close to the maximum. Of course, with such a small range it is not unusual for the algorithm to do this well. For problems in which the number of possible answers is much greater than 50, it's unlikely that the initial population will have such a good fit. After one generation, the program will find the number 16. By generation 4, it will find 17, the actual integer maximum. On a simple problem like this, it is more interesting to watch the overall improvement in the search. The fitness of each member of a generation is determined by dividing the value of the function at that point by the known maximum value. Hence, in the positive range of the function, this fitness value is between 0 and 1. Of course, for a real problem the maximum would not be known at the beginning, so such a fitness value would not be available. For this problem, however, it is instructive to determine the fitness of each member of a given generation. If you run the program as is, you will notice that the average fitness of each generation increases. The initial random population has an average fitness of 0.39 while the fifth-generation average fitness is 0.58. The point is that the genetic algorithm is finding "better" points as it moves from generation to generation.
Improvements
Nothing in the classical approach to mutation or crossover requires that only those methods be used in a genetic algorithm. You could try any procedure you can think of to mix bits in a population to create a new population. Some of your ideas may not produce a new population better than the previous one for a specific application. On the other hand, some of your ideas may work very well. For example, in place of a random mutation in which a bit is changed from 0 to 1 or 1 to 0 with a low probability, why not swap two bits in the current string? Code in the SWAP procedure does just that. While the current version of the main program does not use SWAP, you may want to add it to the system and observe the results. The best way to add this procedure is to replace the call to MUTATE with a call to SWAP. Perhaps you may want to be more daring and swap bits half the time and mutate them the other half. This could be done by replacing the calls to MUTATE in the CROSSOVER procedure with the code shown in Listing Two (page 93).
Another possibility, which I call "inversion," inverts the order of bits in a string between two random points. If you begin with a string 01110101100011 and note two random points in the string with (), such as 011(101011)00011, then the inversion operation will reverse the order of the bits between the two random points to produce the string 011(110101)00011. A procedure for inversion is also provided in the code of Listing One. It is up to you to add it to the crossover procedure and observe its effects.
These are only a couple of examples of other ways in which the mutation operation could be implemented. The crossover operation could also be changed. The field is wide open. Anything you can think of is a valid possibility.
_GENETIC ALGORITHMS_ by Richard Spillman[LISTING ONE]
<a name="0065_000a"> {**************************************************************************} {* GENERAL GENETIC ALGORITHM--a simple implmentation of a genetic algorithm *} {* for function optimization. Dr. Richard Spillman, Dept.of Computer Science*} {* Pacific Lutheran Uni. Tacoma WA 98447 206-535-7406 BITNET: SPILLMAN_R@PLU*} {**************************************************************************} program GGA; uses Crt,Dos; CONST MaxChrom = 50; {Maximum population size is 50} MaxN = 200; {Maximum bit size is 200} TYPE ChromType = RECORD Chr : ARRAY[1..MaxN] of byte; Fit : real; Val : longint; CFit : real; END; ChromPool = ARRAY[1..MaxChrom] of ChromType; VAR I,J,N,Seed,Dis : integer; PopSize,NumGen,Flip,K : integer; Par1,Par2,MaxI,MinI : integer; Pm,AveFit,Pi,start,stop : real; MAXS : real; A : char; Best : ChromType; Pool : ARRAY[0..1] of ChromPool; OutFile : text; Dupl : BOOLEAN; h,m,s,s100,DL : word; {* DATAIN-Reads in (from keyboard) basic data for the genetic algorithm *} PROCEDURE DATAIN; VAR I : integer; BEGIN ClrScr; writeln; writeln; writeln(' ** Genetic Parameters **'); writeln; write(' ENTER POPULATION SIZE: ');readln(PopSize); write(' ENTER THE NUMBER OF BITS IN A GENE: ');readln(N); write(' ENTER NUMBER OF GENERATIONS: ');readln(NumGen); write(' ENTER MUTATION PROBABILITY: ');readln(Pm); write(' ENTER RANDOM SEED: ');readln(J); write(' ENTER YOUR BEST GUESS FOR THE MAX: ');readln(MAXS); write(' ENTER THE MINIMUM INTERVAL VALUE: ');readln(MinI); write(' ENTER THE MAXMUM INTERVAL VALUE: ');readln(MaxI); write(' ENTER THE DISPLAY UPDATE: ');readln(Dis); IF (PopSize Mod 2) = 1 THEN PopSize := PopSize + 1; END; {** POOLSORT - Sort the pool of chromsomes in order of fitness **} PROCEDURE PoolSort(var Chrom:ChromPool); VAR I,J,M1 : integer; Max : real; TmpChrom : ChromPool; ST : array[1..MaxChrom] of integer; BEGIN For I:=1 to PopSize Do ST[I] := 0; For I:=1 to PopSize DO BEGIN Max := 0.0; For J := 1 to PopSize DO BEGIN IF (Chrom[J].Fit > Max) and (ST[J] = 0) THEN BEGIN MAX := Chrom[J].Fit; M1 := J; END; END; ST[M1] := 1; TmpChrom[I] := Chrom[M1]; END; For I:=1 to PopSize DO Chrom[I] := TmpChrom[I]; END; {of POOLSORT} {** BitsToDec - Converts a binary number to decimal **} PROCEDURE BitsToDec(Chrom:ChromType; var Number:longint); VAR i : integer; power : longint; BEGIN Number := 0; power := 1; For i := N downto 1 do BEGIN IF Chrom.Chr[i] = 1 then Number := Number + power; power := power * 2; END; END; {** OPTFUNC-Evaluates function to be optimized. Insert your function here **} FUNCTION OptFunc(Value:longint) : longint; VAR Temp : longint; BEGIN Temp := Value * Value * Value; OptFunc := -Temp + 25*Value*Value - 2*Value + 55; END; {** INITIALIZE - Generate initial Chromosome Pool **} PROCEDURE INITIALIZE(var Chrom:ChromPool; IntStart:integer); VAR I,J : integer; S,K : longint; BEGIN Randomize; RandSeed := IntStart; FOR I:=1 to PopSize DO BEGIN REPEAT FOR J:=1 to N DO BEGIN S:=random(10); IF S > 5 THEN Chrom[I].Chr[J] := 1 ELSE Chrom[I].Chr[J] := 0; END; BitsToDec(Chrom[I],K); Chrom[I].Val := OptFunc(K); UNTIL (K>=MinI) AND (K<=MaxI); END; FOR J:=1 to N DO Best.Chr[J] := 0; Best.Fit := 0.0; END; {** FITNESS - Determines the fitness of a chromosome pool **} PROCEDURE Fitness(var Chrom:ChromPool); VAR i : integer; Value : longint; CF : real; BEGIN CF := 0.0; For i:=1 to PopSize DO BEGIN BitsToDec(Chrom[i],Value); Chrom[i].Val := OptFunc(Value); Chrom[i].Fit := Abs(Chrom[i].Val/MAXS); If Chrom[i].Fit > 1.0 then chrom[i].Fit := 0.01; CF := CF + Chrom[i].Fit; IF Chrom[i].Fit > Best.Fit THEN Best := Chrom[i]; END; AveFit := CF/Popsize; END; {of FITNESS} {** PARENT - Selects a random parent from the pool **} FUNCTION PARENT(Chrom:ChromPool):integer; VAR Rnd : real; I : integer; BEGIN Rnd := random; I := 1; WHILE Rnd > Chrom[I].CFit DO I := I + 1; PARENT := I; END; {** MUTATE - randomly complements a bit in the chromosome **} PROCEDURE MUTATE(var Chrom:ChromType); VAR I : integer; BEGIN FOR I:=1 to N DO IF Pm > random THEN Chrom.Chr[I] := 1 - Chrom.Chr[I]; END; {** SWAP-Randomly swap a bit with its neighbor; halfthe time with its upper **} {** neighbor, half the time with its lower neighbor. **} PROCEDURE SWAP(var Chrom:ChromType); VAR TMP,SP : integer; BEGIN SP:=random(N); IF SP<2 THEN SP := 2; IF SP>(N-1) THEN SP := N-1; IF random > 0.5 THEN BEGIN TMP := Chrom.Chr[SP]; Chrom.Chr[SP] := Chrom.Chr[SP-1]; Chrom.Chr[SP-1] := TMP; END ELSE BEGIN TMP := Chrom.Chr[SP]; Chrom.Chr[SP] := Chrom.Chr[SP+1]; Chrom.Chr[SP+1] := TMP; END; END; {** CROSSOVER-Standard Crossover function. Selects one random point and **} {** switches the tales of the two parents **} PROCEDURE CROSSOVER(P1,P2:ChromType;Var C1,C2:ChromType); VAR I,J :integer; BEGIN I:=random(N); IF I=0 THEN I:=1; FOR J:=1 to N DO IF J < I THEN BEGIN C1.Chr[J] := P1.Chr[J]; C2.Chr[J] := P2.Chr[J]; END ELSE BEGIN C1.Chr[J] := P2.Chr[J]; C2.Chr[J] := P1.Chr[J]; END; END; {** MERGE-Combine two pools to create a single pool containing best of both **} PROCEDURE Merge(old:ChromPool;var new:chromPool); VAR I,J,K : integer; Tmp1,Tmp2 : ChromType; BEGIN K:=1; J:=1; WHILE J < PopSize+1 DO BEGIN IF old[K].Fit > new[J].Fit THEN BEGIN Tmp1 := new[J]; new[J] := old[K]; K:=K+1; IF K > 5 THEN J := PopSize+1; FOR I := J+1 to PopSize DO BEGIN Tmp2:=new[I]; new[I] := Tmp1; Tmp1:=Tmp2; END; END ELSE J:=J+1; END; END; {** DUPLICATE DETECTION/REPLACEMENT-Finds duplicate elements in pool and **} {** replaces them with new random elements **} PROCEDURE DupReplace(var Chrom:ChromPool); VAR I,J,K,S : integer; DP : Boolean; BEGIN I := 1; K := 1; DP := false; WHILE K <= PopSize DO BEGIN IF DP THEN DP := false ELSE K := I+1; DP := false; J := 1; WHILE (J <= N) and (Chrom[I].Chr[J] = Chrom[K].Chr[J]) DO J := J+1; IF J > N THEN DP := true; IF DP THEN BEGIN Dupl := true; FOR J := 1 to N DO BEGIN S := random(10); IF S > 5 THEN Chrom[K].Chr[J] := 1 ELSE Chrom[K].Chr[J] := 0; END; K:=K+1; END ELSE I := K; END; END; {** FILE OUTPUT - outputs each generation to a file **} PROCEDURE FileOut(Chrom:ChromPool; I:integer); VAR J,K : integer; BEGIN writeln(OutFile,'***************** GENERATION ',I:3,'*******************'); writeln(OutFile); write(OutFile,'BEST FIT: '); FOR J:=1 to N DO write(OutFile,Best.Chr[J]); writeln(OutFile,' Fitness: ',Best.Fit); writeln(OutFile,' Average Fit for this generation: ',AveFit); writeln(OutFile); writeln(OutFIle,' Elasped Time: ',stop); FOR J:=1 to PopSize DO BEGIN FOR K:=1 to N DO write(OutFile,Chrom[J].Chr[K]); writeln(OutFile,' ',Chrom[J].Fit:4:3); END; writeln(OutFile); writeln(OutFile,'*****************************************************'); writeln(OutFile); END; {of FILE OUTPUT} {** SCREEN_WIN - Screen_Win will accept two colors and a row **} {** location to set up a window for input/output **} procedure Screen_Win(x1,x2,y1,y2,fg,bg:integer); var i,j :byte; begin TextColor(fg); TextBackground(bg); for i:=x1 to x2 do begin GotoXY(i,y1); write(#205); GotoXY(i,y2); write(#205) end; for i:=(y1+1) to (y2-1) do begin GotoXY(x1,i); write(#186); GotoXY(x2,i); write(#186) end; GotoXY(x1,y1); write(#201); GotoXY(x1,y2); write(#200); GotoXY(x2,y1); write(#187); GotoXY(x2,y2); write(#188); for i:=y1+1 to y2-1 do for j:=x1+1 to x2-1 do begin GotoXY(j,i); Write(' ') end; end; {** Start the timer **} procedure clockon(var startclock : real;var DL:word); var Y,M,DW : word; Begin gettime(h,m,s,s100); startclock := (h*3600) + (m*60) + s + (s100/100); getdate(Y,M,DL,DW); end; {** Stop the timer **} procedure clockoff(var stopclock:real; startclock : real; DL:word); var i:integer; Y1,M,D,DW:word; BEGIN gettime(h,m,s,s100); stopclock:=(h*3600) + (m*60) + s + (s100/100); getdate(Y1,M,D,DW); if D = DL then stopclock:=stopclock - startclock else stopclock:=(D - DL - 1)*86400 + 86400 - startclock + stopclock; END; {** GENERATION OUTPUT-Prints out the current generation and summary data **} PROCEDURE GENOUT(Chrom : ChromPool;I:integer); VAR J,K,S : integer; BEGIN clockoff(stop,start,DL); GoToXY(6,5); write(stop:5:3); GoToXY(19,5); write(I:4); GoToXY(36,5); write(Best.Fit:5:3); GoToXY(50,5); write(AveFit:5:3); GoToXY(62,5); write(MAXS:7); S := N; IF N > 64 THEN S:=64; GoToXY(6,9); FOR J:=1 to S DO write(Best.chr[J]); IF N > 64 THEN BEGIN S:=N; IF N > 128 THEN S := 128; GoToXY(6,10); FOR J:=65 to S DO write(Best.chr[J]); IF N > 128 THEN BEGIN S:=N; IF N > 192 THEN S:=192; GoToXY(6,11); For J:=129 to S DO write(Best.chr[J]); END; END; GoToXY(72,9); write(Best.Val:6); GoToXY(9,16); write('Chrom Value FIT Chrom Value FIT'); FOR J:=1 to 6 DO BEGIN GoToXY(9,16+J); write(J:2); GoToXY(17,16+J); write(Chrom[J].val:6); GoToXY(27,16+J); write(Chrom[J].Fit:4:3); GoToXY(37,16+J); write((J+6):2); GoToXY(43,16+J); write(Chrom[J+6].val:6); GoToXY(53,16+J); write(Chrom[J+6].Fit:4:3); END; IF (I mod Dis) = 0 THEN BEGIN GoToXY(56,2); write('PAUSE'); GoToXY(56,3); write('HIT RETURN . . .'); readln; GoToXY(56,2); write(' '); GoToXY(56,3); write(' '); END; END; {** DISPLAY - Sets up the standard screen display **} Procedure SetDisplay; BEGIN Textbackground(blue); Textcolor(white); clrscr; writeln; writeln(' GENETIC SEARCH'); writeln; Screen_Win(4,14,4,6,yellow,blue); GotoXY(6,4);write('TIME'); Screen_Win(18,31,4,6,yellow,blue); GotoXY(20,4); write('Generation'); Screen_Win(34,46,4,6,yellow,blue); GotoXY(36,4); write('Best Fit'); Screen_Win(48,59,4,6,yellow,blue); GotoXY(50,4); write('Ave Fit'); Screen_Win(61,75,4,6,yellow,blue); GoToXY(63,4);write('Target'); Screen_Win(4,78,8,13,yellow,blue); GotoXY(6,8);write('Current Best Point'); Screen_Win(6,76,15,25,yellow,blue); GotoXY(8,15);write('Population Sample'); END; {** M A I N P R O G R A M **} BEGIN DATAIN; assign(OutFile,'results.txt'); rewrite(OutFile); writeln(OutFile,' GENETIC RUN'); writeln(OutFile,' ********** PARAMETERS FOR THIS RUN ************'); writeln(OutFile,' Number of Generations: ',NumGen:3); writeln(OutFile,' Size of Chromosome Pool: ',PopSize:3); writeln(OutFile,' Mutation Probability: ',Pm:4:3); writeln(OutFile,' Random Start Number: ',Seed:5); writeln(OutFile,' ****************************************************'); writeln(OutFile); clockon(start,DL); INITIALIZE(Pool[0],Seed); FITNESS(Pool[0]); PoolSort(Pool[0]); SetDisplay; GENOUT(Pool[0],0); I:=0; WHILE (I < NumGen) DO BEGIN FLIP := I Mod 2; K:=1; FOR J:=1 to (PopSize div 2) DO BEGIN Par1 := Parent(Pool[Flip]); Par2 := Parent(Pool[Flip]); CROSSOVER(Pool[Flip,Par1],Pool[Flip,Par2], Pool[1-Flip,K],Pool[1-Flip,K+1]); K:=K+2; END; FITNESS(Pool[1-Flip]); PoolSort(Pool[1-Flip]); Merge(Pool[Flip],Pool[1-Flip]); Dupl := false; DupReplace(Pool[1-Flip]); IF Dupl THEN BEGIN FITNESS(Pool[1-Flip]); PoolSort(Pool[1-Flip]); END; FileOut(Pool[1-Flip],I+1); GENOUT(Pool[1-Flip],I+1); I := I+1; END; GoToXY(56,2); write('DONE'); GoToXY(56,3); write('HIT RETURN . . .'); readln; close(OutFile); END. <a name="0065_000b"> <a name="0065_000c">[LISTING TWO]
<a name="0065_000c"> {** INVERSION - Change the order of bits between two random **} {** points - 100(10110)0110 becomes 100(01101)0110 **} PROCEDURE INVERSION(var Chrom:ChromType); VAR I,K,S1,S2,Tmp : integer; BEGIN IF random < Pi THEN BEGIN S1 := random(N-1); IF S1=0 THEN S1:=1; S2 := random(N); WHILE S2 = S1 DO S2 := random(N); IF S2=0 THEN S2:=1; IF S1 > S2 THEN BEGIN Tmp := S2; S2 := S1; S1 := Tmp; END; K := (S2 - S1 + 1) DIV 2; FOR I:=1 TO K DO BEGIN Tmp := Chrom.Chr[I+S1-1]; Chrom.Chr[I+S1-1] := Chrom.Chr[S2-I+1]; Chrom.Chr[S2+1-I] := Tmp; END; END; END;
Copyright © 1993, Dr. Dobb's Journal