# GPdotNET 4.0 first look: Classification with Neural Networks

After some time of implementation and testing, the new version of GPdotNET is out. Go to codeplex page and download it. This is huge step in development of this project, because it contains completely new module based on Artificial Neural Network and other optimization methods e.g. Particle Swarm Optimization.

Almost all aspects of the architecture are changed, or will be changed when this version would be released.

The new Experiment class will replace existing classes for handling experimental data in GA based models, which is not implemented yet. Also New start page will contain more pre-calculated examples.

For this beta here are the new features:

1. New Start Page will be extended with new examples of Neural Nets : – binary classification, – multiclass classification and – regressions examples.
2. Improved module for loading experimental data, which now supports nonnumeric data like categorical or binary data.
3. Depending of the output column of loaded experimental data different learning algorithm is selected. For example if the column is of categorical type, GPdotNET selects Neural Net algorithm with Cross-Entropy and Particle Swarm optimization learning algorithm. If the output column is numerical, GPdotNET selects the Neural Nets with Backpropagation learning algorithm. Currently only two learning algorithms are implemented. More will be implemented in the next beta release.

## Classification problem with GPdotNET

This topic will give quick tutorial how to model classification problem with GPdotNET by using Neural Nets. For this tutorial you need some experimental data which you can download from this location.

• Open GPdotNET choose New Command
• New Dialog pops up, check Artificial Neural Nets, and Press Ok Button.

After Solver Type selection, GPdotNET Creates “Load Experiment” Page in which you can load experimental data, and define the percentage of data for testing the model.

• Press Load Data button.
• New Popup dialog appears.
• Press File button select the file you previously downloaded.
• Check Semicolon and First Row Header check buttons.

Note: When the data is analyzed correctly you should see vertical line “|” between columns. Otherwise data will not be loaded correctly.

• When the experimental data is formatted correctly, press the “Import Data” button.

The next step is preparing columns for modeling.

For each columns we have to set:

• a) proper type of the column (numeric, categorical or binary),
• b) type of the parameter (input, output or ignore)
• c) normalization method (MinMax,Gauss or Custom normalization of the column values ).

To change Column Type, double click on the cell which shows the column type, combo box appears with list of available types.

• To change Parameter Type, double click on the cell which shows the param type, combobox appears with list of available parameter types.
• To change Normalization Type, double click on the cell which shows MinMax value, combobox appears with list of available normalization types.

Note: you can set only one Output column. Put Parameter Type to Ignore if you want to skip column from modelling.

Now we have experimental data, and we can start modelling process. Before that we need to choose how much data will be treated for testing. Enter 10% for testing the data and press Start Modelling button.

Now we have more Pages:

1. Settings page for setting the parameters of the Neural Nets

2. Run page for simulation of searching solution

3. Prediction page which you can see how solution is good against testing data.

### Settings Page

As you can see in Settings Page you can set various parameters for Neural Nets and Particle Swarm Optimization. Try to train the model with different parameters values. For this tutorial you can leave parameters as are.

### Modeling Page

Modeling page contains two diagrams. The first diagram shows errors with respect of the iteration number which is very useful for monitoring the searching process. The second diagram which is below the previous shows current best solution (blue line) in comparison with the experimental data (red line). Also on the left side of the page, you can see several iteration number, error, and other information about searching process.

### Prediction Page

Prediction page shows how current best model predict data. Predict page contains tabular and graphical representation of predicted data, which is compared with data for testing.

## Features don’t work in this BETA

1. Exporting Neural Network model

2. Saving to gpa file Neural Network Models

# Brood Recombination new feature in GPdotNET

As currently implemented GPdotNET has classic crossover implementation without any intelligent way to exchange genetic materials. In most time classic crossover operation is destructive operation wasting lot of good genetic materials. By including brood recombination crossover can be slightly improved.

Brood recombination simple repeats crossover operation several time on the same parents, with different crossover points. After fitness evaluation of offspring, the best two child are kept and others are discarded. On that way there is a better chance to get better child than with classic crossover. The picture below graphically describes brood recombination.

## Brood Size – new GPdotNET parameter

The first feature which will be implemented is manually setting the Brood Size of crossover. By adding Brood Recombination, we will increase possibility that two chromosomes will exchange the best genetic material they have.

The next feature will be brood size which will be generating dynamically and will be dependent of the generation number.

# Citations of GPdotNET

Recently I have googled about GPdotNET to find out how people use GPdotNET. I was surprised that there are plenty of sites which are published GPdotNET as freeware software. I have also found several scientific papers which citated GPdotNT as well. Some other people used it as elegant example in their  lessons. Some students used GPdotNET in seminars and diploma works, master and phd thesis.

All in all I was very excited about it. So lets list some interested web sites and scientific paper which mentioned GPdotNET.

3.  Genetic Programming article:  http://www.answers.com/topic/genetic-programming

4.  Teaching lessons from  Faculty of Informatics, Burapha University, Thailand:  Evolutionary Algorithms Applied to Finance

5.  Karlsruhe Institute of Technology Paper Work: Evolutionary Algorithms.

6. Scientific paper, ACS Vol.14:  Use of Learning Methods to Improve Kinematic Models

7. Scientific paper, JPE Vol.15: Modeling of Discharge Energy in Electrical Discharge Machining by the use of Genetic Programming

8. Scientific paper, Thermophysics 2012: Evolution algorithms as toolfor optimization of water vapour transport properties of cellular concrete.

10. Scientific paper, : Aspects regarding the use of Genetic Programming in modelling the cutting process

11. Scientific paper, IJMECH Vol.3, : Predicting Of Machining Quality In Electric Discharge Machining Using Intelligent Optimization Techniques

12. Scientific paper, SLAAI Jornal: Framework for Discovery of Data Models Using Genetic Programming

13. Book By AKERKAR, RAJENDRA, Introduction of Artifical Intelligence.

# GPdotNET v3.0 is out

I am very proud to announce GPdotNET v3.0, the new version which brings 3 new solvers based on Genetic Algorithm and several new features. In previous posts I was  writing about those new features:

The source code and Click Once installation are uploaded in codeplex site, so go and grab the new version. hppt://gpdotnet.codeplex.com

I am asking for feedback and bug reports. If you find something strange post a comment here on anywhere in blog site or codeplex site. I will try to answer as quickly as possible.

Here are screenshots with new Assignment and Transportation solvers in GPdotNET:

Loaded  training data of Assignment problem

Simulation Tab page with optimal result of Assignment problem

Loaded  training data of Transportation problem

Simulation Tab page with optimal result of Transportation problem

# C# Matrix Chromosome implementation in Genetic Algorithm

The next version of GPdotNET will be extended with 3 new solvers:

– Traveling Salesman Problem (TSP),
– Assignment problem (AP) and
– Transportation Problem (TP)

More or less the first two solvers are already presented in previous posts. Both TSP and AP can be implemented with vector based representation, which is already posted here. For transportation problem new chromosome representation have to be implemented. Before we go in to the implementation let give short introduction of Transportation problems.

Transportation problem is one of the simplest combinatorial problem. It searches the determination of a minimum cost transportation plan for a single commodity from a number of sources to a number of destinations. It requires the specification of the level of supply at each source, the amount of demand at each destination, and the transportation cost from each source to each destination. Solution is to find the amount to be shipped from each source to each destination in such a way that the total transportation cost have to be minimized.
Let’s assume we have $n$ sources and $k$ destinations. Let’s also assume that amount of supply at source $i$ is $SRC(i)$ and demand at destination $j$ is $DST(j)$. The unit transportation cost between source $i$ and destination $j$ is $CST(i,j)$. Let $x_{ij}$ is the amount transported from source $i$ to destination $j$ transportation problem can be formulated like the following:
Minimize:

$min { \sum_{i=1}^{n}\sum_{j=1}^{m} c_{ij} x_{ij}}$

${ \sum_{i=1}^{n} x_{ij}=a_i}, (i=1,...,n)$

${ \sum_{j=1}^{m} x_{ij}=b_j, (j=1,...,m)}$

${ x_{ij}>=0, (i=1,...,n), (j=1,...m)}$

A typical transformation problem is shown in picture below. In this example we have 3 sources and 4 destination. The supply is shown on Supply column, and destination cost is shown on the last row. Total supply and demand is 45. The unit transportation cost is shown on central table.

The optimal solution is shown in picture below. The total cost is 315.

The most natural way to represent this kind of problem is 2 dimensional matrix, because our solution must be presented as picture show above in tabular data.

Theory behind matrix representation chromosome is beyond this blog post, so in the next section it will be shows only C# implementation with some description. If you want more information behind this implementation you can find it in the book: Genetics Algorithm + Data Structure = Evolution Programs which can be see at Google books at pages 187-191.

Implementation of main method: GenerateMatrix:

///
/// Initialize randomly matrix chromosome for transport problem based on book
/// Genetic Algorithm+Data Structure= Evolution programs
///
///r- number of rows (sources)
///c- number of columns (destination)
///src[] - source values
///dest[] - destination values
/// Cost matrix cost(i,j)
internal static int[][] GenerateMatrix(int r, int c, int[] src, int[] dest)
{
//initi values
int tot = r * c;
var lst = new List(tot);

//prepare values for generation
int counter = 0;
var vals = new int[r][];
for (int i = 0; i < r; i++)
{
vals[i] = new int[c];
for (int j = 0; j < c; j++)
{
counter++;
}
}

while (true)
{
//exit from while must be before list is empty
if (lst.Count == 0)
throw new Exception("null");

int ind = Globals.radn.Next(lst.Count);
int q = lst[ind];

int i = (int)Math.Floor((double)(q) / (double)(c));
int j = q % c;

//if element is visited generate again random number
if (vals[i][j] != 0)
continue;

lst.RemoveAt(ind);

int val = Math.Min(src[i], dest[j]);
vals[i][j] = val;
src[i] = src[i] - val;
dest[j] = dest[j] - val;

bool canBreak = true;
for (int k = 0; k < r; k++)
if (src[k] != 0)
{
canBreak = false;
break;
}

if (canBreak == false)
continue;

for (int k = 0; k < c; k++)
if (dest[k] != 0)
{
canBreak = false;
break;
}
//if all sources and destination are zero, generation is finished
if (canBreak)
return vals;
}
}


This is main function for this type of chromosome because all other operations is based on this method.
The second implementation is Crossover.

///
// Crossover based on Book Genetic Algorithm +Data Structure = Evolution Programs.
public void Crossover(IChromosome parent2)
{

GAMChromosome ch2= parent2 as GAMChromosome;
if(ch2==null)
throw new Exception("ch2 cannot be null!");

int[] srcREM1 = new int[rows];
int[] destREM1 = new int[cols];
int[] srcREM2 = new int[rows];
int[] destREM2 = new int[cols];

for (int i = 0; i < rows; i++)
{
for (int j = 0; j < cols; j++)
{
double m1 = value[i][j] + ch2.value[i][j];
double d = Math.Floor(m1 / 2.0);
int r = (((int)m1) % 2);

value[i][j] = (int)d;
ch2.value[i][j] = (int)d;
srcREM1[i] += r;
destREM1[j] += r;
srcREM2[i] += r;
destREM2[j] += r;
}
}
for (int i = 0; i < rows; i++)
{
srcREM1[i] /= 2;
srcREM2[i] /= 2;
}
for (int j = 0; j < cols; j++)
{
destREM1[j] /= 2;
destREM2[j] /= 2;
}
var mat1 = GenerateMatrix(rows, cols, srcREM1, destREM1);
var mat2 = GenerateMatrix(rows, cols, srcREM2, destREM2);
for (int i = 0; i < rows; i++)
{
for (int j = 0; j < cols; j++)
{
value[i][j] += mat1[i][j];
ch2.value[i][j] += mat2[i][j];
}
}
return;
}


As we can see crossover method contain two calls for GenerateMatrix method which randomly initialize sub matrix chromosome.

Mutation operation also depends of  GenerateMatrix because it randomly choose location which needs to be recreated. The following code contains two Mutate methods where the second one calls GenerateMatrix for subMatrix random generation.

///  Mutation based on Book Genetic Algorithm +Data Structure = Evolution Programs.
public void Mutate()
{
//choose random number of cols and rows
int locRows = Globals.radn.Next(1,rows);
int locCols = Globals.radn.Next(1,cols);
//define array for holding random indexs
var localRows = new int[locRows];
var localCols = new int[locCols];

//generate random source
int counter = 0;
while (true)
{
var num = Globals.radn.Next(rows);

var found = false;
for (var i = 0; i < localRows.Length; i++)
{
if (localRows[i] == num)
{
found = true;
break;
}
}
if (!found)
{
localRows[counter] = num;
counter++;
}
if (counter  == localRows.Length)
break;
}
//generate random destination
counter = 0;
while (true)
{
var num = Globals.radn.Next(cols);

var found = false;
for (var i = 0; i < localCols.Length; i++)
{
if (localCols[i] == num)
{
found = true;
break;
}
}
if (!found)
{
localCols[counter] = num;
counter++;
}
if (counter == localCols.Length)
break;
}
//perform mutation
Mutate(locRows, locCols, localRows, localCols, this);

}

internal static int[][] Mutate(int r, int c, int[] rs, int[] cs, GAMChromosome ch)
{
var source = new int[r];
var destination = new int[c];
//calculate Source for random submatrix
for (int i = 0; i < rs.Length; i++)
for (int j = 0; j < cs.Length; j++)
source[i]+= ch.value[rs[i]][cs[j]];

//calculate Destination for random submatrix
for (int i = 0; i < cs.Length; i++)
for (int j = 0; j < rs.Length; j++)
destination[i] += ch.value[rs[j]][cs[i]];

var subMatrix= GAMChromosome.GenerateMatrix(r, c, source, destination);

//merge generated submatrix to matrix
for (int i = 0; i < rs.Length; i++)
for (int j = 0; j < cs.Length; j++)
ch.value[rs[i]][cs[j]] = subMatrix[i][j];

return subMatrix;
}


This was the code implementation around Matrix based Chromosome representation for Genetic Algorithm. The full source code about it will be published with the first beta of GPdotNET v3.0. I can promisse that the first beta will be out for less that month.
Stay tuned

# GPdotNET vNext: Assignment and Transportation problems

GPdotNET vNext

Recently I wrote blog post  about implementation of TSP problem in GPdotNET. I got positive feedback about implementation, but there were people saying if can I implement more problems based on linear programming and optimization. So I have decided to implement Assignment and Transportation problems in GPdotNET. With existing 4,  the next version of GPdotNET 2.5 will contains in total the following problem domains:

1. Modelling discrete data set with GP,

2. Modelling and optimization discrete data set with combination of GP and GA,

3. Modelling Time series data set with GP,

4. Optimization of Analytic function with GA,

5. Traveling Salesman Problem with GA,

6. Assignment Problem with GA,

7. Transportation problem with GA.

In order to implement those problems it has to be implemented two new chromosome types:

1. Vector based GA chromosome for solving TSP and Assigment problem

2. Matrix based GA chromosome for solving Transport problem.

The first one (vector based chromosome type) is already implemented and needs to be modified for accepting different format data, in order to fully support TSP and Assignment problems at the same time. For second type there are several solutions for implementation for Matrix based chromosome, one of my favorite is published in the book “Genetic Algorithms + Data Structures = Evolution Programs”.

I can say this would be an exciting summer. :)

# Function optimization with Genetic Algorithm by using GPdotNET

Content

1. Introduction
2. Analytic function optimization module in GPdotNET
3. Examples of function optimizations
4. C# Implementation behind GPdotNET Optimization module

Introduction

GPdotNET is artificial intelligence tool for applying Genetic Programming and Genetic Algorithm in modeling and optimization of various engineering problems. It is .NET (Mono) application written in C# programming language which can run on both Windows and Linux based OS, or any OS which can run Mono framework. On the other hand GPdotNET is very easy to use. Even if you have no deep knowledge of GP and GA, you can apply those methods in finding solution. The project can be used in modeling any kind of engineering process, which can be described with discrete data. It is also be used in education during teaching students about evolutionary methods, mainly GP and GA. GPdotNET is open source project hosted at http://gpdotnet.codeplex.com

With releasing of  GPdotNET v2 it is also possible to find optimum for any analytic function regardless of independent variables. For example you can find optimum value for an analytically defined function with 2, 5, 10 or 100 independent variables. By using classic methods, function optimization of 3 or more independent variables is very difficult and sometimes impossible. It is also very hard to find optimum value for functions which are relatively complex regardless of number of independent variables.
Because GPdotNET is based on Genetic Algorithm we can find approximated optimum value of any function regardless of the limitation of number of independent variables, or complex definition. This blog post is going to give you a detailed and full description how to use GPdotNET to optimize function. Blog post will also cover C# implementation behind optimization proce by showing representation of Chromosome with real number, as well as Fitness calculation which is based on Genetic Programming tree expression. In this blog post it will also be presented several real world problem of optimization which will be solved with GPdotNET.

# Analitic Function Optimization Module in GPdotNET

When GPdotNET is opened you can choose several predefined and calucalted models from various domain problems, as weel as creating New model among other options. By choosing New model new dialog box appears like picture below.

By choosing Optimization of Analytic Function (see pic above) and pressing OK button, GPdotNET prepares model for optimization and opens 3 tab pages:

1. Analytic function,
2. Settings and
3. Optimize Model.

## Analytic function

By using “Analytic function” tab you can define expression of a function. More information about how to define mathematics expression of analytic function can be found on this blog post.

By using “Analytic definition tool” at the bottom of the page, it is possible to define analytic expression. Expression tree builder generates function in Genetic Programming Expression tree, because GPdotNET fully implements both methods. Sharing features of Genetic programming  in Optimization based Genetic Algorithm is unique and it is implement only in GPdotNET.

When the process of defining function is finished, press Finish button in order to proceed with further actions. Finish button action apply all changes with Optimization Model Tab. So if you have made some changed in function definition, by pressing Finish button changes will be send to optimization tab.
Defining expression of function is relatively simple, but it is still not natural way for defining function, and will be changed in the future. For example on picture 2, you can see Expression tree which represent:

$f(x,y)=y sin{4x}+1.1 x sin{2y}$.

## Setting GA parameters

The second step in optimization is setting Genetic Algorithm parameter which will be used in optimization process. Open the Setting tab and set the main GA parameters, see pic. 3.

To successfully applied GA in the Optimization, it is necessary to define:

1.  population size,
2. probabilities of genetic operators and
3. selection methods.

These parameters are general for all GA and GP models. More information about parameters you can find at https://bhrnjica.net/gpdotnet.

## Optimize model (running optimization)

When GA parameters are defined, we can start with optimization by selecting Optimization model tab. Before run, we have to define constrains for each independent variables. This is only limitation we have to define i  order to start optimization. The picture below shows how to define constrains in 3 steps:

1.  select row by left mouse click,
2. enter min and max value in text boxes
3. Press update button.

Perform these 3 actions for each independent variable defined in the function.

When the process of defining constrains is finished, it is time to run the calculation by pressing Optimize button, from the main toolbar(green button). During optimization process GPdotNET is presenting nice animation of fitness values, as well as showing current best optimal value. The picture above shows the result of optimization process with GPdotNET. It can be seen that the optimal value for this sample is $f_{opt}(9.96)=-100.22$.

# Examples of function optimization

In this topic we are going to calculate optimal value for some functions by using GPdotNET. Zo be prove that the optimal value is correct or very close to correct value we will use Wolfram Alpha or other method.

### Function: x sin(4x)+1.1 x sin(2y)

GP Expression tree looks like the following picture (left size):

Optimal value is found (right above picture) for 0.054 min, at 363 generation of total of 500 generation. Optimal value is f(8.66,9.03)=-18.59.

Here is Wolfram Alpha calculation of the same function. http://www.wolframalpha.com/input/?i=min+x*sin%284*x%29%2B+1.1+*y*+sin%282+*y%29%2C+0%3Cx%3C10%2C0%3Cy%3C10

### Function:  (x^2+x)cos(x),  -10≤x≤10

GP expression tree looks like the following picture (left size):

Optimal value is found for 0.125 min, at 10 generation of total of 500 generation. Optimal value is F(9.62)=-100.22.

Here is Wolfram Alpha calculation of the same function. http://www.wolframalpha.com/input/?i=minimum+%28x%5E2%2Bx%29*cos%28x%29+over+%5B-10%2C10%5D

### Easom’s function fEaso(x1,x2)=-cos(x1)•cos(x2)•exp(-((x1-pi)^2+(x2-pi)^2)), -100<=x(i)<=100, i=1:2.

GP expression tree looks like the following picture (left size):

Optimal value is found for 0.061 min, at 477 generation of total of 500 generation. Optimal value is F(9.62)=-1, for x=y=3.14.

Function can be seen at this MatLab link.

# C# Implementation behind GPdotNET Optimization module

GPdotNET Optimization module is just a part which is incorporated in to GPdotNET Engine. Specific implementation for this module is Chromosome implementation, as well as Fitness function. Chromosome implementation is based on  floating point value instead of classic binary representation. Such a Chromosome contains array of floating point values and each element array represent function independent variable. If the function contains two independent variables (x,y) chromosome implementation will contains array with two floating points. Constrains of chromosome values represent constrains we defined during settings of the optimization process. The following source code listing shows implementation of GANumChrosomome class in GPdotNET:

public class GANumChromosome: IChromosome
{
private double[] val = null;
private float fitness = float.MinValue;
//... rest of implementation
}


When the chromosome is generated array elements get values randomly generated between min and max value defined by function definition. Here is a source code of Generate method.

///
/// Generate values for each represented variable
///
public void Generate(int param = 0)
{
if(val==null)
val = new double[functionSet.GetNumVariables()];
else if (val.Length != functionSet.GetNumVariables())
val = new double[functionSet.GetNumVariables()];

for (int i = 0; i < functionSet.GetNumVariables(); i++)
val[i] = Globals.radn.NextDouble(functionSet.GetTerminalMinValue(i), functionSet.GetTerminalMaxValue(i));

}


Mutation is accomplish when randomly chosen array element randomly change itc value. Here is a listing:

///
///  Select array element randomly and randomly change itc value
///
public void Mutate()
{
//randomly select array element
int crossoverPoint = Globals.radn.Next(functionSet.GetNumVariables());
//randomly generate value for the selected element
val[crossoverPoint] = Globals.radn.NextDouble(functionSet.GetTerminalMinValue(crossoverPoint), functionSet.GetTerminalMaxValue(crossoverPoint));
}


Crossover is little bit complicated. It is implemented based on Book Practical Genetic Algoritms see pages 56,57,48,59. Here is an implementation:

///
/// Generate number between 0-1.
/// For each element array of two chromosomes exchange value based on random number.
///
///
public void Crossover(IChromosome ch2)
{
GANumChromosome p = (GANumChromosome)ch2;
int crossoverPoint = Globals.radn.Next(functionSet.GetNumVariables());
double beta;
for (int i = crossoverPoint; i < functionSet.GetNumVariables(); i++)
{
val[i] = val[i] - beta * (val[i] - p.val[i]);
p.val[i] = p.val[i] + beta * (val[i] - p.val[i]);
}
}


Fitness function for Optimization is straightforward, it evaluates each chromosome against tree expression. For minimum the better chromosome is lower value. For maximum better chromosome is the chromosome with higher fitness value. Here is a implementation of Optimizatio Fitness function:

///
/// Evaluates function agains terminals
///
///
///
///
public float Evaluate(IChromosome chromosome, IFunctionSet functionSet)
{
GANumChromosome ch = chromosome as GANumChromosome;
if (ch == null)
return 0;
else
{
//prepare terminals
var term = Globals.gpterminals.SingleTrainingData;
for (int i = 0; i < ch.val.Length; i++)
term[i] = ch.val[i];

var y = functionSet.Evaluate(_funToOptimize, -1);

if (double.IsNaN(y) || double.IsInfinity(y))
y = float.NaN;

//Save output in to output variable
term[term.Length - 1] = y;

if (IsMinimize)
y *= -1;

return (float)y;
}
}


# Summary

We have seen that Function optimization module within GPdotNET is powerful optimization tool. It can find pretty close solution for very complex functions regardless of number of independent variables. Optimization module use Genetic Algorithm method with floating point value chromosome representation described in several books about GA. It is fast, simple and can be used in education as well as in solving real problems. More info about GPdotNET can be found at https://bhrnjica.net/gpdotnet.