Bed Against The Wall: Modern Genetic (And Other) Algorithms Explained: Part 3

(This is part 3 in the Modern Genetic Algorithms Explained series, click here to go to the first post, or browse through all the parts with the Table Of Contents at the end of this post.)

In this part, we will look at a "special" genetic algorithm, CHC by Eshelman. The original paper is very hard to find, but it is mentioned in a lot of other works.

The pseudocode looks like this (thanks to the commenters for sorting some problems out):

Create initial population: P

L := length of an individual chromosome

N := population size

Threshold := 

Threshold := MutProb * (1.0-MutProb) * L (or L/4 is also used)

Evolution until timelimit hit or satisfiable solution found:

  CPop := {}

  For i := 1 to N/2 do:

    Choose two random parents: P1 and P2

    If (different bits between P1 an P2) / 2 > threshold do:

      Create children C1 and C2 using half-uniform crossover

      Add C1 and C2 to CPop



  If there are no children in Cpop:

    Threshold := threshold - 1

  Else:

    P := best N individuals from P and CPop



  If threshold < 0

    Cataclysmic creation of new population P

    

Threshold := MutProb * (1.0-MutProb) * L (or L/4 is also used)

There is another pseudocode description in the slides found here.

A few interesting remarks:
(1) N (population size) is almost always set to 50, but can range from 50 - 10000.
(2) The different bits between P1 and P2 can be defined as the hamming distance between them.
(3) Half uniform crossover swaps exactly half of the non-matching bits. However, often a uniform crossover is used, with a chance of 0.5 - 0.8 of swapping.
(4) MutProb is the mutation probability and originally set to 0.35 (35%).
(5) A "cataclysmic event" occurs when there are no children created for a certain period of time. New children can only be made between parents which are different enough. Basically this means: whenever the population converges towards a certain points, a cataclysm occurs.
(6) What this cataclysm will do depends on the actual implementation. Originally, and often, the following method is used: take the single best individual, and put it in the new population. Now, mutate each of its bits with a 35% chance, this will be the second individual. Repeat this to create a new population of size N. Sometimes, the mutation chance is set to a higher value.

My Python - again, based on the code found here - implementation looks as follows (the original problem was unchanged):

from random import randint, random

from operator import add



def individual(length, min, max):

  return [ randint(min,max) for x in xrange(length) ]



def population(count, length, min, max):

  return [ individual(length, min, max) for x in xrange(count) ]



def fitness(individual, target):

  sum = reduce(add, individual, 0)

  return abs(target-sum)



def grade(pop, target):

  summed = reduce(add, (fitness(x, target) for x in pop))

  return summed / (len(pop) * 1.0)



def hamming(ind1, ind2):

  nr_hamming = 0

  for x in range(0, len(ind1) - 1):

    if (ind1[x] != ind2[x]):

      nr_hamming += 1

  return nr_hamming



def cataclysm(pop, target, min, max):

  #keep the best individual, flip 35% of bits to get new individuals

  pop.sort (lambda x, y : cmp (fitness(x, target), fitness(y, target)))

  firstind = pop[0]

  newpop = [firstind]

  for x in range(1, len(pop)):

    nextind = firstind[:]

    for i in range(0, len(nextind)):

      if 0.35 > random():

        nextind[i] = randint(min, max)

    newpop.append(nextind)

  return newpop  



def hux(ind1, ind2):

  child_one = []

  child_two = []

  hamming_dist = hamming(ind1, ind2);

  nr_swaps = 0

  for x in range(0, len(ind1)):

    if (ind1[x] == ind2[x]) or (random > 0.5) or (nr_swaps > hamming_dist / 2):

      #same, just copy to both

      child_one.append(ind1[x])

      child_two.append(ind2[x])

    else:

      #different, swap with .5 probability, until hamming/2 swaps

      nr_swaps += 1

      child_one.append(ind2[x])

      child_two.append(ind1[x])

  return [child_one,child_two]



def evolve(pop, target, min, max, treshold):

  child_population = []

  for i in range(1, len(pop)/2):

    #choose two random parents:

    parent_one = pop[randint(0, len(pop)-1)]

    parent_two = pop[randint(0, len(pop)-1)]

    if (hamming(parent_one, parent_two)/2) > treshold[0]:

      #do hux crossover

      children = hux(parent_one, parent_two)

      child_population.append(children[0])

      child_population.append(children[1])

  if len(child_population) == 0:

    treshold[0]-=1;

    print "No children evolved"

  else:

    p_count = len(pop);

    print len(child_population),"children"

    for x in child_population:

      pop.append(x)

    pop.sort (lambda x, y : cmp (fitness(x, target), fitness(y, target)))

    #for x in pop:

    #    if fitness(x,target) == 0:

    #        print "Perfect individual found:",x

    pop = pop[:p_count]

    print len(pop),"new population, grade:", grade(pop, target)

  if treshold[0] < 0:

    pop = cataclysm(pop, target, min, max)

    print "Cataclysm, newpop length:",len(pop),"grade:",grade(pop,target)

    treshold[0] = len(pop[0]) / 4.0

    print "Treshold is now:",treshold[0]

  return pop

This reminds me: I should really work on a css class for code (update: done), instead of writing everything in monospace. A few remarks:
(1) The implementation is a bit hacky. Python passes everything by reference, except immutable objects. I wanted to pass treshold by reference, which did not work, it being a float and such. That's why I've wrapped it in a list.
(2) I'll use L/4 as the treshold; and I still use a 35% mutate rate, although we are not using bit encoded individuals, though we could set this a bit higher if we wanted.
(3) We do crossover by randomly swapping different values with a 0.5 chance, until half of the values are swapped. Probability-wise, this is not the same as randomly picking half of the different bits. This doesn't matter that much for this example, though.

Let's test it:

import sys

sys.path.append("C:\Users\Seppe\Desktop")

from chc_eshelman import *

target = 300

p_count = 50

i_length = 6

i_min = 0

i_max = 100

treshold = [i_length / 4.0]

p = population(p_count, i_length, i_min, i_max)

print "First grade: ",grade(p, target)

for i in range(0,100):

  p=evolve(p, target, i_min, i_max, treshold)

In the first run, it took two cataclysms to reach a completely perfect population (grade is the average score for the complete population, not for the best single individual, it might be possible to have a perfect individual in the first evolution, still, because this problem is so simple, we look at the complete population):

First grade:  66.88

48 children

50 new population, grade: 30.34

44 children

50 new population, grade: 18.92

48 children

50 new population, grade: 10.64

38 children

50 new population, grade: 6.68

40 children

50 new population, grade: 4.74

36 children

50 new population, grade: 3.84

12 children

50 new population, grade: 3.48

12 children

50 new population, grade: 3.12

6 children

50 new population, grade: 3.0

No children evolved

No children evolved

Cataclysm, newpop length: 50 grade: 48.24

Treshold is now: 1.5

46 children

50 new population, grade: 17.36

36 children

50 new population, grade: 7.8

32 children

50 new population, grade: 4.1

20 children

50 new population, grade: 2.76

14 children

50 new population, grade: 2.44

16 children

50 new population, grade: 2.12

22 children

50 new population, grade: 1.68

20 children

50 new population, grade: 1.28

18 children

50 new population, grade: 1.0

No children evolved

No children evolved

Cataclysm, newpop length: 50 grade: 48.86

Treshold is now: 1.5

40 children

50 new population, grade: 21.04

46 children

50 new population, grade: 5.3

36 children

50 new population, grade: 1.56

40 children

50 new population, grade: 0.38

32 children

50 new population, grade: 0.0

Another run only takes four evolutions to reach a perfect population, with a beautiful convergence:

First grade:  51.16

46 children

50 new population, grade: 24.26

46 children

50 new population, grade: 12.6

34 children

50 new population, grade: 5.78

38 children

50 new population, grade: 0.94

34 children

50 new population, grade: 0.0

20 children

50 new population, grade: 0.0

Sometimes however, the algorithm gets stuck in a loop:

...

18 children

50 new population, grade: 1.0

22 children

50 new population, grade: 1.0

24 children

50 new population, grade: 1.0

16 children

50 new population, grade: 1.0

26 children

50 new population, grade: 1.0

24 children

50 new population, grade: 1.0

24 children

50 new population, grade: 1.0

18 children

50 new population, grade: 1.0

18 children

50 new population, grade: 1.0

14 children

50 new population, grade: 1.0

16 children

50 new population, grade: 1.0

16 children

50 new population, grade: 1.0

This has something to do with the way the hamming distance is calculated. Sometimes, a pool of two different solutions will be made, but with more than one different values, thus this will always be above the hamming treshold, but will always create the same children, and the same resulting new population.

For example, the algorithm can get stuck in a pool with two types of parents:
1: [83, 19, 67, 64, 23, 44], sum 300 (the target was 300, so fitness: 300-300 = 0: perfect) 2: [38, 28, 67, 64, 6, 97], sum 300

Both are optimal (but the sum might also be both 299, or 299 and 301, etc)... Notice that the hamming distance between them is four, far above the treshold, thus the following children can be created:
[38, 28, 67, 64, 23, 44], sum 264 [38, 19, 67, 64, 6, 97], sum 291 [83, 28, 67, 64, 6, 44], sum 292 [83, 19, 67, 64, 6, 97], sum 336

However, these children all perform worse and will never be considered for the new population, and this is how we get stuck in a loop.

If we'd used a bit-representation, or other workarounds, this would've worked better. For example use another check: if there is no change in the population members: do treshold := treshold - 1. Still, it's good enough to show the workings of the algorithm.

In conclusion, CHC performs very well with only a very limited population size, even in problems where local maxima are common.

If you want to download the source code used in this post, you can find it here.

-----

Table Of Contents (click a link to jump to that post)

1. Introduction

2. Genetic Algorithms

3. CHC Eshelman

4. Simulated Annealing

5. Ant Colony Optimization

6. Tabu Search

7. Conclusion

6 comments :

AnonymousFebruary 20, 2009 10:55 PM
very nice .. in your pseudo code for the CHC algorithm
the

"Add C1 and C2 to CPop"

needs to be indented a bit to be in line with
the If-statement. Your code implements this
nicely.
MacuyikoFebruary 21, 2009 4:03 PM
Thanks for catching that - fixed now.
AnonymousFebruary 23, 2009 4:51 PM
Great - by the way, did you reformulate the original algorithm to what you have above?

I find your version much clearer than the
original from Eshelman's 1991 publication (which
I tracked down).

Again, great explanation for the CHC.
MacuyikoFebruary 24, 2009 1:26 AM
Thanks. The reformulation is taken (and rewritten a bit) from the reddit-discussion (mentioned in the first post in this series). It's a really good high-level description. Remarks (1) to (5) are mentioned in various other sources, although the original (1991) values for CHC still seem to work best.
AnonymousApril 14, 2009 10:41 PM
Hello,

It's me again .. two more corrections based on the original Eshelman
paper (I've been reading a lot about this algorithm lately):

If different bits between P1 an P2 > threshold do

should be

If (different bits between P1 an P2)/2 > threshold do

ie if the (hamming distance/2) ...

and at the bottom of the loop after the cataclysmic mutation

Threshold := L/4

should be set to

Threshold := MutProb * (1.0-MutProb) * L

MutProb = mutation probability, usually set to 35% for CHC
Again, all on the original pseudo code given by Eshelman.

Hope that helps.
MacuyikoApril 20, 2009 11:01 PM
Both Threshold := MutProb * (1.0-MutProb) * L
and Threshold := L/4 are used, depending on the paper/implementation, I've edited the pseudocode though with this more correct treshold.

Hamming / 2, you're right, apologies for the error - thanks for catching that!

Bed Against The Wall

Tuesday, January 13, 2009

Modern Genetic (And Other) Algorithms Explained: Part 3 - CHC Eshelman

6 comments :