# Blog Archives

## Introduction

This document presents Skrgic Selection Method, the one of several selection methods in GPdotNET. While I was developing GPdotNET I have had several conversation with my friend Fikret Skrgic (master in computer scinence and math) regarding selection in Evolutionary algorithms. He is an incredible man, and in just a few minutes while I was describing to him what I want from a new way of selection, he came out with the basic idea of the new selection method. After that time in just a few mails I had a new selection method ready for implementation in GPdotNET. I gave the name to the new selection method by his surname. It is a little thankfulness to him.

In the flowing text it will be presented the idea behind Skrgic Selection.

## Liner Skrgic Selection (LSS)

The Idea behind this selection is based on chromosome fitness. The rule is simple: The bigger chromosome fitness gives bigger chance to select better chromosome. In Population, chromosomes are already sorted from best to worst chromosomes.
The process of selection is the following:

1. Find the maximum and minimum fitness value from the population.
2. The best chromosome (maximum fitness) has position 0 in the population (zero based index), and the worst chromosome has index $N-1$.
3. Let’s give them name as $f_{max}$ and $f_{min}$ respectively.
4. Choose random number $r$ between $f_{min}$ and $f_{max}$.
5. Choose random chromosome from the Population $chrom$.
6. Compare the values of $r$ and $f_{chrom}$. If the $f_{chrom}$ is greater than $r$, select $chrom$.
7. Repeat steps 3 and 4 until you select one chromosome.
8. Repeat steps 2,3,4,5 until you select $n$ chromosomes.

The following picture shows graph of linear Skrgic selection:

Linear Skrgic Selection (LSS)

## Nonlinear Skrgic Selection (NSS)

In LSS we have linear graph of selection probability. This means that if we have chromosome with $f_{chrom}$ fitness value, and another chromosome with $\frac{f_{chrom}}{2}$ fitness value, the probability of selection of two chromosomes is $p$ and $\frac{p}{2}$ respectively. This means that probability of selection is growing linearly.
If we want to change probability between chromosomes we can define a factor $k$ to be like selection pressure on whole chromosomes in the population. So let the k be a real value, and define the fitness of each chromosome as:

$f_{nss}=f_{chrom} (1+k \frac{f_{chrom}}{f_{max}})$,

where:

• $f_{chrom}$ – fitness value of the chromosome,
• $f_{max}$ – maximum fitness value(fitness of the best chromosome of the population),
• $k$ – selection pressure,
• $f_{nss}$ – nonlinear fitness value of the chromosome.

We can conclude that the maximum nonlinear fitness value of the best chromosome is given by:

$f_{max(nss)}=f_{max}(1+k)$.

For various value of parameter $k$  we can define graph of probability selection like on the following picture:

Graph for various sample of NonLinear Skrgic Selection (NLSS)

On the picture above we can see if $k=0$ we get the standard Linear Skrgic Selection (LSS). We can also conclude that there is a fitness value when the probability of selection is constant and not depends of parameter $k$. Very interesting graph is when the $k=-1$. In this case the selection probability of the worst and the best chromosome are equal.

GPdotNET settings of GP parameters

## Yet another scientific work based on GPdotNET

University of Bihać will organize 8th Scientific Conference about Development and Revitalization of production (RIM 2011) this week. I am also on this conference with my work about Modeling Hardness of steel weld by genetic programming, which is based on my genetic programming tool GPdotNET. I am very proud to announce that this is my second work based on the GPdotNET. After my work about modeling and optimization of Drilling process by Evolutionary algorithms from 2007, this time it is about modeling steel weld. This work is introduction of my Master Thesis which is going to be out soon :).
The presentation will be at Saturday at 9:30. More abut the conference can be found at http://www.rim.tfb.ba.
Meet you there.

## The New version of GPdotNET is coming soon

The new stable release of the GPdotNET will be built on .NET 4.0, so the .NET 3.5 will not be supported at all. The reason for that is the compatibility with ParallelFX. From .NET 4.0 ParallelFX is integral part of the .NET 4.0 System namespace.

Some of the new feature:

1. Localization –The application will be localized on English and Bosnian language.

2. Improvements of the selection engine. Current 4 selections (Elite, Rank, Roulette-Wheel and Tournament) are optimized and rewritten.

a. There are 3 new selections:

i. Stochastic Uniform selection SUS
ii. Fitness Uniform Selection Scheme FUSS
iii. Skrgic Selection Scheme SSS

3. Parallel processing is also optimized but it is still not complete.

4. Fitness measure is expanded to over 15 types (similar to GeneExproTools 4.0):

1. Absolute Error with Selection Range
2. Absolute/Hits
3. MSE (mean squared error)
4. RMSE (root mean squared error)
5. MAE (mean absolute error)
6. RSE (relative squared error)
7. RRSE (root relative squared error)
8. RAE (relative absolute error)
9. Relative Error with Selection Range
10. Relative/Hits
11. rMSE (relative MSE)
12. rRMSE (relative RMSE)
13. rMAE (relative MAE)
14. rRSE (relative RSE)
15. rRRSE (relative RRSE)
16. rRAE (relative RAE)
17. R-square
18. Correlation Coefficient

5. Export results to Excel for further analysis

Screenshots:

Figure 1 Dialog for creating a model 1. Data model, 2. Time series model

Figure 2 Excel definition of the function, for Excel result export

Figure 3: 7 Selection methods

Figure 4: Over 15 Fitness type of measures

Figure 5: Export Options

## GEALIB – aplikacija za modeliranje i optimizaciju evolucijskim algoritmima

Napomena: Juni 2010. Ovaj tekst predstavlja početak razvoja aplikacije koja je poslije prerasla u GPdotNET open source projekat.
Dio rada iz predanog u maju 2008 godine sa postdiplomskog studija, vezano za aplikaciju za modeliranje i optimizaciju evolucijskim algoritmima.

U pripremi Seminarskih radova na postdiplomskom studiju, za obradu eksperimentalnih podataka procesa bušenja kao i eksplozivne obrade, razvijana je .NET biblioteka napisana u programskom jeziku C# za primjenu evolucijskih algoritama u obradnim procesima.

Aplikacija podržava proizvoljan broj nezavisnih ulaznih varijabli Xi i jednu zavisno promjenjivu izlaznu varijablu Y.

Slika 14: Osnovni Izgled GEALIB aplikacije

#### Učitavanje eksperimentalnih podataka

Početak rada u aplikaciji počinje učitavanjem exsperimentalnih podataka iz datoteke. Aplikacija podržava CSV (Comma Separated Value) extenziju. Učitavanje podataka počinje klikom da dugme „Učitaj…“.

Nakon učitavanja podataka podaci se raspoređuju dijagramski te tabelarno kao na slici 13. U dijagramu su prikazan vrijednosti preko broja pokusa (x osa) i vrijednosti izlazne promjenjive (y- osa).

#### Podešavanje parametara genetskog programiranja

Nakon učitavanja eksperimentalnih podataka, počinje proces podešavanja GP parametara. Klikom da Tab „Podešavanje“, pojavljuje se dijaloški okvir za podašavanje osnovnih parametara GP.

Slika 15: Parametri GP

Osnovni parametri GP su:

1. Veličina populacije – broj hromosoma u populaciji

2. Maksimalna dubina drveta hromosoma pri inicijalizaciji, ukrštanju i mutaciji

3. Vjerojatnost operacija za ukršatanje, mutaciji i reprodukciju

4. Kriterij odabira – najvažniji parametar GP. Pored višestruke regresije GEALIB podržava i ostale kriterije.

Metode selekcije mogu biti: Elite,Rank, Roulet

5. Metoda inicijalizacije hromosoma: Puna, Rastuća i Miješana.

#### Izbor Terimana i Funkcija u GP

Nakon definisanja osnovnih parametara GP, slijedi izbor terminala i funkcija u GP. Naredna slika pokazuje dostupne funkcije u GEALIB.

Slika 16: Definisanje Funkcija i Terminala u GEALIB

Pored skupa nezavisno promjenjivih Xi u Terminale spadaju i slučajno generirane konstante koje se u ovom fazi podašavanja GP definišu. Sa slike se može vidjeti da se konstante mogu generisati iz proizvoljnog intervala, kao broj konstanti koji se generiše.

#### Pokretanje GP

Pokretanje GP započinje definisanje broja evolucija, kroz koje će GP proći dok ne završi proces. Proces se može i prije završiti ako se dobije tačan model. Startanje GP implementirano je na način da se GP pokreće u pozadinskoj niti, dok glavna nit programa ostaje slobodna i korisnik može prelaziti iz tabulatora.

Dodatno je implementirana i opcija nasilnog zaustavljanja GP, koja poslije zaustavljanja izračunava najbolji horomosom koji je izračunat i prikazuje ga.

Slika 17: GEALIB u izračunavanju modela

Tokom izračunavanja modela, desni dijagram prikazuje simulaciju najboljih rješenja dobijenih tokom evolucije. U donjem dijelu je progres kontrola za prikaz statusa GP.

Kada aplikacija postigne tačan rezultat ili maksimalan broj iteracija (evolucija) zaustavlja se GP i prikazje se najbolje dobijeno rješenje.

U tabulatoru „Rezultati“ prikazano je najbolje rješenje kao i tabelarna komparacija eksperimentalnih rezultata i rezulatata dobijenog pomoću GP. U ovoj fazu GP završava svoj proce i počinje proces optimizacije dobijenog modela pomoću Genetskog Algoritma.

Dobijeni model prikazan je u notaciji koji je kompatibilan sa MS Excel, kao i MS Word, tako da se jednostavni kopiranje model može importovati u Excel za daljnju obradu i preračunavanje modela.

Slika 18: Prikaz modela i komaparacija rezulatata

#### Optimizacija GP modela pomoću GA

Nakon dobijanja GP modela nastupa faza optimizacije modela pomoću GA. Donja slika prikazuje interfejs za optimizaciju. Slično kao i kod GP prolazi se proce definisanja parametara GA i pokretanje GA.

Slika 19: Podešavanje parametrara GA za optimizaciju

Aplikacija je testirana na bezbroj primjera kako analitičkih funkcija tako i eksperimantalnih podataka. Sadrži sve opcije i parametre klasične metode genetskog programiranja i optimizacije genetskim algoritmom koji ima mogućnost reprezentacije pomoću binarnog, cjelobrojnog i realnog broja.

Aplikacija radi sa proizvoljnim brojem nezavisnih varijabli i samo jednom zavisno promjenjivom.