As far as modules in GAF go, genCore is the most important one. Before you read on, however, I strongly suggest you go back and read the page about [Genetic Algorithms]. Understanding what GAs are, what they are good for, and what are the "working personas" there, is very important in understanding the genCore API. Also, the project is written in Java, and code is explained in the Java language, and for convinience I chose to use the Java 1.5.0 syntax. I suggest you have some understanding of that language if you wish to continue reading.
The primary goal of the genCore framework is to achieve an API where building blocks could be used to assemble a GA solution. That is, instead of forcing the development team to implement the same interfaces over and over again, only a minor portion, the problem-specific one, is required. The rest of the code is implemented to support most of the GA techniques, so a development team could choose, for example, whether they want their Individual to represent the solution as a string of bits, or as a linked list object structure.
Some re-naming to terms from the GAs article: In genCore, Individuals are called Genotype, as they carry the genetic data, which is called the Structure of the Genotype. This structure can be represented in many ways, and each representation might require its own Operators. The genetic Pool is called Population, and it contains the Genotypes which fight for survival. The survival of the fittest idea is brought to life by a property carried by the Genotype, called Fitness rank.
Before we dwell into how things are working Today, I'd like to explain how it evolved to it. First, we need to see what "building blocks" really mean - and for that we need to start at the beginning. Lets take the classical method for solving a GA as an example:
public static Genotype solve(Population p) { while (!p.isConverged) { Collection<GenotypeCouple> selectionPool = p.selection(); p.clear(); for (GenotypeCouple couple : selectionPool) { Genotype child = couple.father().crossover(couple.mother()); child = child.mutate(); p.genotypes().put(child.fitness(), child); } } return p.getBest(); }
We could see that for such a solve() method, we would require a GenotypeCouple structure that contains father() and mother() entries, and the interfaces Genotype and Population to be defined as follows:
public interface Genotype { double fitness(); Genotype crossover(Genotype mate); Genotype mutate(); // Used to access the structure of the Genotype. Used by the fitness() method. Object structure(); } public interface Population { Map<Double, Genotype> genotypes(); boolean isConverged(); Collection<GenotypeCouple> selection(); void clear(); Genotype getBest(); // Used to create genotypes initially. Used before called solve(). void initialize(); }
''Note that this is Not code from genCore, nor it ever was: Its demonstration code.''
As you can see, all the Operators discussed [earlier] are here: The Genotype exposes the Evaluation, Crossover and Mutation Operators, while the Population exposes the Selection and Initialize Operators. Lets assume that these interfaces are implemented and that the Genotype interface is representing its Structure using a string of bits, a BitSet. In that case, the crossover(Genotype) and mutate() methods will be implemented to take care of the general case of a BitSet-based Genotype. This structure presents two difficulties, one of specialization and the other of extending the basic interface.
The specialization problem occurs when the basic implementation isn't enough. Suppose the toolkit contains an implementation of the [BitSet|http://java.sun.com/j2se/1.4.2/docs/api/java/util/BitSet.html BitSet]-based Genotype. The crossover() method, if implemented as a one-point crossover operator, would cut the father and mother Genotypes at a random point, and then take the left part of the father and string it with the right part of the mother. If, however, we would like to use a two-point crossover operator for a specific problem, we would need to either override the crossover method, or have the toolkit supply in the another implementation of a Genotype that already supports this kind of crossover operator. Both options are unacceptable, and the more perceptive readers would notice that a good solution to this problem would be the use of the [Strategy Design Pattern], also known as Functors.
To dive into code a little, using Functors would be seen like this (this solution is shown only for the crossover() method, however is implemented for all operators of both the Population and Genotype interfaces):
public interface CrossoverFunctor { Genotype work(Genotype father, Genotype mother); } public abstract class AbstractGenotype { public final Genotype crossover(Genotype mate) { return getCrossoverFunctor().work(this, mate); } private CrossoverFunctor crossoverFunctor; private CrossoverFunctor getCrossoverFunctor() { return crossoverFunctor; } }
The question of how the CrossoverFunctor member is initialized to begin with is irrelevant - It could be initialized in the constructor of the AbstractGenotype or using a factory which returns Functors for the Genotype. What is important, is that now we could create any kind of CrossoverFunctor supporting class, and attach it to our AbstractGenotype. Using only one Genotype class, BitsetGenotype, and several CrossoverFunctor classes (such as SinglePointCrossoverFunctor, TwoPointCrossoverFunctor and UniformCrossoverFunctor, for example), we have a dynamic Genotype solution for every kind of crossover operator.
So far so good. However, this only fixes the specialization problem. In case you forgot by now, the second difficulty was the extension of the basic interface. Suppose that for the travelling salesman problem the solution would require a [maturity operator]. Might seem like a simple task - Just extend the BitsetGenotype and implement a mature() method in the new class - But that kind of solution practically screams: "Who calls the mature() operator?". To call this method, a reimplementation of the solve() method would be neccessary, for every solution type which requires a new operator or a different sequence in the life cycle of the solve() method. Further, such a solution would require a change to the basic BitsetGenotype, which would suggest the toolkit not dynamic enough to support different types of GA solutions.
The solution for this problem is something I tend call the Production Line design pattern. The interfaces of Genotype and Population are reduced to information containers only: No more will they perform operations, nor will operators be in any way connected to them. The operators themselves are in fact an array of the [Visitor Design Pattern], connected to each other in a [Graph Structure]. A single visitor is called "a station", and can be connected to any number of input nodes and any number of output nodes. A node process is started when enough input is received from its input nodes. Then the processing of the input starts, and the result is aggregated to all its output nodes to be handled. This way, the entire solve() method could be transformed to a "production line", and every operator needed could be created as a station, connected to the next operator in line. The concept can be expanded by adding condition stations which receive an input and their result is aggregated only to specific nodes according to a pre-defined condition (see [image|http://ga-fwork.sourceforge.net/docs/examples/images/solve_prod_line.jpg image]).
In genCore, the production line is analogued to electrical circuits, and stations are circuit items. To show a little code, we'll look at how the Genotype and Population interfaces are defined now, and see how the circuit and circuit items are defined.
public interface Genotype { double getFitness(); void setFitness(newFitness); } public interface Population { void addGenotype(Genotype); void clear(); Genotype getGenotype(int rank); // replacing getBest(). getBest() could be implemented as getGenotype(0). } public interface CircuitItem { void addOutputLeg(CircuitItem) void process(Object); }
The process() method of the CircuitItem interface actually accumulates the Object items given to it until enough input is received, and only then real processing begins. Only when real processing is finished, the result is aggregated to all "legs" added using the addOutputLeg() method.
Finally, we reach genCore today. Today, the Circuits design pattern is used to desribe everything from a condition, through a buffer of Genotype objects, to an Operator. Now, for some extended features, some of them are to be released soon:
- GUI Editor: While using circuits is all fine and dandy, it could be havoc to start linking CircuitItem objects together, trying to figure out what the circuit looks like after everything is written out. Using a GUI editor simplifies the task: You drag and drop items, and the GUI editor compiles it into an XML file (desribed soon). Changes made to the XML file or to the diagram on the GUI editor change each other respectively.
- XML Files: Since circuits can be described easily using XML files (see xml file of the previous image), using them both simplifies the code (consider the alternative) and reduces compilation requirements: Changing a circuit structure will not require recompilation, as the circuit is created at runtime.
- XML Schema Definition: To make the creation of circuit XML files and the parsing of such files easier, an XML Schema Definition (XSD) is created to describe such files (See xsd file. Using XSD files, classes could be generated to make parsing easier (Using JAXB instead of SAX or DOM, for example). Using XSD files, the circuit's XML files could be validated for structure integrity before sent to the XML circuit creator.
- Debugging: Debugging GA has always been a task for the selected few. Using the circuits data structure, the GUI could go through each step in the circuit, showing the change in the Genotype and Population structures as they pass each CircuitItem.
- Progress Diagnosis: Using the [Observer Design Pattern] makes progress bars an easy task - Register for the one of the main CircuitItem_s as a _CircuitItemListener and receive notification whenever it is reached in the process. Or even easier - Create your own CircuitItem which is designed to change the GUI of your software according to the Genotype or Population going through it.
To conclude, I'd say that the API of genCore is not flawless. As of today, genCore is before the release of version 0.7, and has some time before releasing genCore 1.0-RC1. However, it is very mature in its nature: The benefits of using genCore as opposed to writing another implementation are endless, including the amount of code needed to write such an engine and the amount of QA which is needed to assure that all the cases are covered by the new engine. The developers at genCore cannot promise perfection - But like the GA themselves, they can promise close-to-best solutions. And the hell with modesty.
