I don't have a lot to say, but this is my little bit.

Friday, October 2, 2009

Another Genetic Algorithm In Python

I rewrote the genetic algorithm script with a better organizational structure. I added an Environment class as a container for the algorithm and a FitnessEvaluator class as a delegate for evaluating the fitness of Organisms. That additional structure means I can encapsulate the selection criteria all in that FitnessEvaluator object, producing some different kinds of results.

In the following code, the fitness evaluator applies a selection criteria which pressures the value string toward being alphabetical, from left to right. It's fun to watch this run, to see letters from the front of the alphabet appear toward the left of the string as the process runs, each character dropping slowly down toward its lowest "energy state" as it were, one letter more than the character to its left. Let's look at the simple code which produces this effect:
for i in range( 0, len( organism.value ) - 1 ):
if( organism.value[i] < organism.value[i+1] ):
organism.fitness += (149 - ord(organism.value[i]) - i - abs(ord(organism.value[i]) - ord(organism.value[i+1])));
else:
break


First, we are looping across the value string from left to right, and we stop at the first character which is not less than the character to its right. For each character which is less than the character to its right, we add points to the organism's fitness, by adding 149 points, but subtracting points for undesirable characteristics. For instance, chars at the front of the alphabet are preferred, so we subtract more points for letters at the end of the alphabet, which we determine using the ord function: 'a' has an ord value of 97; 'b', 98; and so on. So by subtracting that value, we show preference for 'a' over 'b' over 'c', and so on. Then, we show preference for lowering the values of characters toward the left of the string by subtracting i. Finally, we show preference for smaller gaps between letters by subtracting the absolute value of the scalar distance between the two characters.

This is a very different way of calculating fitness than the previous script. As you watch this one run you will see the string build up from the left instead of appearing from all parts of the string.
import random

chars = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'
,'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't'
,'u', 'v', 'w', 'x', 'y', 'z'
#, 'A', 'B', 'C', 'D','E', 'F', 'G', 'H', 'I', 'J'
#, 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T'
#, 'U', 'V', 'W', 'X', 'Y', 'Z'
#, '1', '2', '3', '4', '5', '6', '7', '8', '9', '0'
]

numberOfRuns = 1;
numberOfOrganisms = 50;
chanceOfMutation = 3 # one chance in this many
numberOfMutations = 3 # up to this many, if mutations are selected to happen at all
matesPerGeneration = 3; # each organism produces this many offspring for the next generation
blank = """zzzzzzzzzzzzzzzzzzzzzzzzzz"""

# an Organism is a wrapper for an internal value and a level of fitness
# Organisms know how to mate with other organisms and produce child organisms
# Organisms know how to mutate themselves
class Organism:
fitness = 0
isOptimal = False
value = blank

# when we sort an Organism, we sort in the alpha order of its value string
def __cmp__( self, other ):
return cmp( other.fitness, self.fitness )

# an organism mates with another organism by comparing the chars of their values
# if the chars are equal, the offspring gets that char
# if they are not equal, the offspring gets one or the other, randomly
def mate( self, other ):
retCell = Organism()
retCell.value = ""
for i in range( (len(self.value) + len(other.value)) / 2 ):
if self.value[i % len(self.value)] == other.value[i % len(other.value)]:
retCell.value += self.value[i % len(self.value)]
elif random.randint(0,1) == 0:
retCell.value += self.value[i % len(self.value)]
else:
retCell.value += other.value[i % len(other.value)]
return retCell

# this function may mutate the value of the organism in random places
def maybeMutate( self ):
# there are three kinds of mutations: insertion, deletion, and replacement
if( random.randint( 0, chanceOfMutation ) == 0 ):
for i in range(random.randint(0,numberOfMutations)):
insertLoc = random.randint(0, len(self.value)-1)
self.value = self.value[0:insertLoc] + chars[random.randint(0, len(chars)-1)] + self.value[insertLoc+1:]

###############################################################################################
# this object will encapsulate the logic of determining fitness of an organism in an environment
class FitnessEvaluator:
def evaluateFitness( self, organism ):
# this fitness evaluation routine gives points when the string is in ever-more
# alphabetical order, until each letter in the string is followed by its neighbor
organism.fitness = 0
for i in range( 0, len( organism.value ) - 1 ):
if( organism.value[i] < organism.value[i+1] ):
# we give points when a char is less than its right neighbor
# the more less it is, the more points
# we assign more points for lessness at the left of the string
# the 'ord' function returns the integer value of the ascii code of a character
# the ord value of 'a' is 97
# 149 = 97 + 26 + 26
organism.fitness += (52 - i - ord(organism.value[i]) - abs(ord(organism.value[i]) - ord(organism.value[i+1])));
else:
break

if( organism.value == "abcdefghijklmnopqrstuvwxyz" ):
organism.isOptimal = True

return organism.fitness


###############################################################################################
# an environment contains organisms and constraints
class Environment:
def __init__( self, numberOfOrganisms, matesPerGeneration ):
# build the list of organisms
self.organisms = []
for i in range( numberOfOrganisms ):
self.organisms.append( Organism() )
self.matesPerGeneration = matesPerGeneration

def beginGenerations( self ):
keepGoing = True
iterations = 0

while( keepGoing ):
# a fresh next generation
iterations += 1

self.advanceOneGeneration()

print "%d Most fit: %d (len %d) ~ <<%s>>" % (iterations, self.organisms[0].fitness, len(self.organisms[0].value), self.organisms[0].value)

for cell in self.organisms:
if( cell.isOptimal ):
keepGoing = False

return iterations

# mutate, mate, and select the organisms one time
def advanceOneGeneration( self ):
self.organisms = self.selectOrganisms(
self.mateOrganisms(
self.mutateOrganisms( self.organisms )
)
)

# sorts the given list of next-gen organisms and selects the top of the list
def selectOrganisms( self, nextGenerationOrganisms ):
for organism in nextGenerationOrganisms:
FitnessEvaluator().evaluateFitness( organism )
nextGenerationOrganisms.sort()
return nextGenerationOrganisms[0:numberOfOrganisms]

# mutate the cells
# evaluate the fitness of the mutated cells
# returns the same list it was passed
def mutateOrganisms( self, organisms ):
for organism in organisms:
organism.maybeMutate()
return organisms

# mate all the cells into a new generation
# this will produce matesPerGeneration*len(cells) next-gen cells
def mateOrganisms( self, organisms ):
offspring = []
for cell1 in organisms:
for i in range(0, self.matesPerGeneration):
cell2 = organisms[random.randint(0, len(organisms)-1)]
childCell = cell1.mate( cell2 )
offspring.append( childCell )
return offspring


###############################################################################################
# Run the system

print "Iteration #\tGenerations"
for i in range( numberOfRuns ):
env = Environment( numberOfOrganisms, matesPerGeneration )
print "%d\t%d" % (i, env.beginGenerations())

No comments:

Post a Comment