# Descriptive statistics and data normalization with CNTK and C#

As you probably know, CNTK is the Microsoft Cognitive Toolkit for deep learning. It is an open source library used by various Microsoft products, and also a powerful library for developing custom ML solutions in different fields, on different platforms, and with different languages. What also makes CNTK powerful is the way it is implemented. The library is implemented as a series of computation graphs, which are fully elaborated into the sequence of steps performed during deep neural network training.

Each CNTK compute graph is created from a set of nodes, where each node represents a numerical (mathematical) operation. The edges between the nodes in the graph represent the data flow between operations. Such a representation allows CNTK to schedule computation on the underlying hardware, GPU or CPU. CNTK can dynamically analyze the graphs in order to optimize both latency and the efficient use of resources. The most powerful part of this is the fact that CNTK can calculate the derivative of any constructed set of operations, which can be used for efficient learning of the network parameters. The following image shows the core architecture of CNTK.

On the other hand, any operation can be executed on the CPU or the GPU with minimal code changes. In fact, we can implement a method which automatically uses GPU computation if it is available. CNTK is the first .NET library which enables .NET developers to write GPU-aware .NET applications.
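A minimal sketch of such a device-selection helper could look like the following (assuming the CNTK C# package is referenced; `DeviceHelper` and `GetDevice` are illustrative names, not part of the CNTK API):

```csharp
using System.Linq;
using CNTK;

public static class DeviceHelper
{
    // Returns the first GPU device if one is present, otherwise falls back to the CPU.
    // DeviceDescriptor and DeviceKind come from the CNTK C# API.
    public static DeviceDescriptor GetDevice()
    {
        var gpu = DeviceDescriptor.AllDevices()
                                  .FirstOrDefault(d => d.Type == DeviceKind.GPU);
        return gpu ?? DeviceDescriptor.CPUDevice;
    }
}
```

The rest of the code then simply passes the returned `DeviceDescriptor` to every `Evaluate` call, so the same program runs on either device without changes.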

What this means is that with this powerful library you can run complex math computations directly on the GPU in .NET using C#, which is currently not possible with the standard .NET libraries.

In this blog post I will show how to calculate some basic statistics operations on a data set.

Say we have a data set with 4 columns (features) and 20 rows (samples). The C# implementation of this 2D array is shown in the following code snippet:

```csharp
static float[][] mData = new float[][] {
    new float[] { 5.1f, 3.5f, 1.4f, 0.2f},
    new float[] { 4.9f, 3.0f, 1.4f, 0.2f},
    new float[] { 4.7f, 3.2f, 1.3f, 0.2f},
    new float[] { 4.6f, 3.1f, 1.5f, 0.2f},
    new float[] { 6.9f, 3.1f, 4.9f, 1.5f},
    new float[] { 5.5f, 2.3f, 4.0f, 1.3f},
    new float[] { 6.5f, 2.8f, 4.6f, 1.5f},
    new float[] { 5.0f, 3.4f, 1.5f, 0.2f},
    new float[] { 4.4f, 2.9f, 1.4f, 0.2f},
    new float[] { 4.9f, 3.1f, 1.5f, 0.1f},
    new float[] { 5.4f, 3.7f, 1.5f, 0.2f},
    new float[] { 4.8f, 3.4f, 1.6f, 0.2f},
    new float[] { 4.8f, 3.0f, 1.4f, 0.1f},
    new float[] { 4.3f, 3.0f, 1.1f, 0.1f},
    new float[] { 6.5f, 3.0f, 5.8f, 2.2f},
    new float[] { 7.6f, 3.0f, 6.6f, 2.1f},
    new float[] { 4.9f, 2.5f, 4.5f, 1.7f},
    new float[] { 7.3f, 2.9f, 6.3f, 1.8f},
    new float[] { 5.7f, 3.8f, 1.7f, 0.3f},
    new float[] { 5.1f, 3.8f, 1.5f, 0.3f},
};
```


If you want to play with CNTK and math calculations you need some knowledge of calculus, as well as of vectors, matrices and tensors. In CNTK every operation is performed as a matrix operation, which may simplify the calculation process for you; in the standard approach you would have to deal with multidimensional arrays during the calculations. As far as I know, there is currently no other .NET library which can perform math operations on the GPU, which constrains the .NET platform for implementing high performance applications.

If we want to compute the average value and the standard deviation of each column, we can do that very easily with CNTK. Once we have computed those values, we can use them to normalize the data set by computing the standard score (Gauss standardization).

The Gauss standardization is calculated by the following formula:

$nValue = \frac{X-\mu}{\sigma}$,
where $X$ is the column value, $\mu$ the column mean, and $\sigma$ the standard deviation of the column.
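As a quick sanity check of the formula, independent of CNTK, the standard score can be computed directly in plain C#. The sketch below uses the first feature (X1) of the first four samples of the data set above:

```csharp
using System;
using System.Globalization;
using System.Linq;

class Standardize
{
    static void Main()
    {
        // first feature (X1) of the first four samples from the data set above
        double[] x = { 5.1, 4.9, 4.7, 4.6 };

        double mean = x.Average();
        // population standard deviation, matching sigma in the formula above
        double std = Math.Sqrt(x.Select(v => (v - mean) * (v - mean)).Average());

        // standard score: (X - mu) / sigma for every value in the column
        var z = x.Select(v => (v - mean) / std);

        Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
            "mean={0:F4}, std={1:F4}", mean, std));
        Console.WriteLine(string.Join(", ",
            z.Select(v => v.ToString("F4", CultureInfo.InvariantCulture))));
    }
}
```

The resulting column has mean 0 and standard deviation 1, which is exactly what the normalization step later in the post produces for all four columns at once.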

For this example we are going to perform three statistics operations, and CNTK automatically provides us with the ability to compute those values on the GPU. This is very important when you have a data set with millions of rows, since the computation can then be performed in a few milliseconds.

Any computation process in CNTK can be achieved in several steps:

1. Read data from an external source or from in-memory data.
2. Define Value and Variable objects.
3. Define the Function for the calculation.
4. Perform the evaluation of the function by passing the Variable and Value objects.
5. Retrieve the result of the calculation and show it.

All of the above steps are implemented in the following listing:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using CNTK;

namespace DataNormalizationWithCNTK
{
    class Program
    {
        static float[][] mData = new float[][] {
            new float[] { 5.1f, 3.5f, 1.4f, 0.2f},
            new float[] { 4.9f, 3.0f, 1.4f, 0.2f},
            new float[] { 4.7f, 3.2f, 1.3f, 0.2f},
            new float[] { 4.6f, 3.1f, 1.5f, 0.2f},
            new float[] { 6.9f, 3.1f, 4.9f, 1.5f},
            new float[] { 5.5f, 2.3f, 4.0f, 1.3f},
            new float[] { 6.5f, 2.8f, 4.6f, 1.5f},
            new float[] { 5.0f, 3.4f, 1.5f, 0.2f},
            new float[] { 4.4f, 2.9f, 1.4f, 0.2f},
            new float[] { 4.9f, 3.1f, 1.5f, 0.1f},
            new float[] { 5.4f, 3.7f, 1.5f, 0.2f},
            new float[] { 4.8f, 3.4f, 1.6f, 0.2f},
            new float[] { 4.8f, 3.0f, 1.4f, 0.1f},
            new float[] { 4.3f, 3.0f, 1.1f, 0.1f},
            new float[] { 6.5f, 3.0f, 5.8f, 2.2f},
            new float[] { 7.6f, 3.0f, 6.6f, 2.1f},
            new float[] { 4.9f, 2.5f, 4.5f, 1.7f},
            new float[] { 7.3f, 2.9f, 6.3f, 1.8f},
            new float[] { 5.7f, 3.8f, 1.7f, 0.3f},
            new float[] { 5.1f, 3.8f, 1.5f, 0.3f},
        };

        static void Main(string[] args)
        {
            //define the device on which the calculation will execute
            var device = DeviceDescriptor.UseDefaultDevice();

            //print the data to the console
            Console.WriteLine($"X1,\tX2,\tX3,\tX4");
            Console.WriteLine($"-----,\t-----,\t-----,\t-----");
            foreach (var row in mData)
            {
                Console.WriteLine($"{row[0]},\t{row[1]},\t{row[2]},\t{row[3]}");
            }
            Console.WriteLine($"-----,\t-----,\t-----,\t-----");

            //convert the data into an enumerable list
            //(ToEnumerable is a small helper extension method from the full source)
            var data = mData.ToEnumerable<IEnumerable<float>>();

            //assign the values
            var vData = Value.CreateBatchOfSequences<float>(new int[] { 4 }, data, device);
            //create a variable to describe the data
            var features = Variable.InputVariable(vData.Shape, DataType.Float);

            //define the mean function for the variable
            //Axis(2) - calculate the mean along the third axis, which holds the 4 features
            var mean = CNTKLib.ReduceMean(features, new Axis(2));

            //map variables and data
            var inputDataMap = new Dictionary<Variable, Value>() { { features, vData } };
            var meanDataMap = new Dictionary<Variable, Value>() { { mean, null } };

            //mean calculation
            mean.Evaluate(inputDataMap, meanDataMap, device);
            //get the result
            var meanValues = meanDataMap[mean].GetDenseData<float>(mean);

            Console.WriteLine($"");
            Console.WriteLine($"Average values of the features: x1={meanValues[0][0]}, x2={meanValues[0][1]}, x3={meanValues[0][2]}, x4={meanValues[0][3]}");

            //calculation of the standard deviation
            var std = calculateStd(features);
            var stdDataMap = new Dictionary<Variable, Value>() { { std, null } };
            //std calculation
            std.Evaluate(inputDataMap, stdDataMap, device);
            //get the result
            var stdValues = stdDataMap[std].GetDenseData<float>(std);

            Console.WriteLine($"");
            Console.WriteLine($"STD of the features: x1={stdValues[0][0]}, x2={stdValues[0][1]}, x3={stdValues[0][2]}, x4={stdValues[0][3]}");

            //once we have the mean and the std we can calculate the standardized values of the data
            var gaussNormalization = CNTKLib.ElementDivide(CNTKLib.Minus(features, mean), std);
            var gaussDataMap = new Dictionary<Variable, Value>() { { gaussNormalization, null } };
            //normalization calculation
            gaussNormalization.Evaluate(inputDataMap, gaussDataMap, device);

            //get the result
            var normValues = gaussDataMap[gaussNormalization].GetDenseData<float>(gaussNormalization);

            //print the normalized data to the console
            Console.WriteLine($"-------------------------------------------");
            Console.WriteLine($"Normalized values for the above data set");
            Console.WriteLine($"");
            Console.WriteLine($"X1,\tX2,\tX3,\tX4");
            Console.WriteLine($"-----,\t-----,\t-----,\t-----");
            var row2 = normValues[0];
            for (int j = 0; j < 80; j += 4)
            {
                Console.WriteLine($"{row2[j]},\t{row2[j + 1]},\t{row2[j + 2]},\t{row2[j + 3]}");
            }
        }

        //sketch of the std function (not shown in the original snippet):
        //population standard deviation along the sample axis, sqrt(mean((x - mean)^2))
        private static Function calculateStd(Variable features)
        {
            var mean = CNTKLib.ReduceMean(features, new Axis(2));
            var diff = CNTKLib.Minus(features, mean);
            var variance = CNTKLib.ReduceMean(CNTKLib.ElementTimes(diff, diff), new Axis(2));
            return CNTKLib.Sqrt(variance);
        }
    }
}
```
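Because CNTK executes the computation as a graph on the selected device, it can be handy to cross-check its results on the CPU with plain LINQ. The sketch below (a standalone program, not part of the listing above) recomputes the per-column mean and population standard deviation of the same 20×4 data set:

```csharp
using System;
using System.Globalization;
using System.Linq;

class CrossCheck
{
    static void Main()
    {
        // the same 20x4 sample data set as in the post
        float[][] mData = {
            new[]{5.1f,3.5f,1.4f,0.2f}, new[]{4.9f,3.0f,1.4f,0.2f},
            new[]{4.7f,3.2f,1.3f,0.2f}, new[]{4.6f,3.1f,1.5f,0.2f},
            new[]{6.9f,3.1f,4.9f,1.5f}, new[]{5.5f,2.3f,4.0f,1.3f},
            new[]{6.5f,2.8f,4.6f,1.5f}, new[]{5.0f,3.4f,1.5f,0.2f},
            new[]{4.4f,2.9f,1.4f,0.2f}, new[]{4.9f,3.1f,1.5f,0.1f},
            new[]{5.4f,3.7f,1.5f,0.2f}, new[]{4.8f,3.4f,1.6f,0.2f},
            new[]{4.8f,3.0f,1.4f,0.1f}, new[]{4.3f,3.0f,1.1f,0.1f},
            new[]{6.5f,3.0f,5.8f,2.2f}, new[]{7.6f,3.0f,6.6f,2.1f},
            new[]{4.9f,2.5f,4.5f,1.7f}, new[]{7.3f,2.9f,6.3f,1.8f},
            new[]{5.7f,3.8f,1.7f,0.3f}, new[]{5.1f,3.8f,1.5f,0.3f}
        };

        for (int c = 0; c < 4; c++)
        {
            // extract column c and compute mean and population std
            var col = mData.Select(r => (double)r[c]).ToArray();
            double mean = col.Average();
            double std = Math.Sqrt(col.Select(v => (v - mean) * (v - mean)).Average());
            Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
                "x{0}: mean={1:F4}, std={2:F4}", c + 1, mean, std));
        }
    }
}
```

The values printed here should match what the CNTK graph returns, up to float rounding.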
## Training a previously saved model

Training a previously saved model is very simple, since it requires no special coding. Right after the trainer is created with everything it needs (network, learning rate, momentum and so on), you just need to call:

```csharp
trainer.RestoreFromCheckpoint(strIrisFilePath);
```

No additional code needs to be added. The method above is called after you have successfully saved the model state by calling:

```csharp
trainer.SaveCheckpoint(strIrisFilePath);
```

which is usually done at the end of the training process. The complete source code from this blog post can be found here.

# How to setup learning rate per iteration in CNTK using C#

So far we have seen how to train and validate models in CNTK using C#. There are also many more details which should be revealed in order to better understand the CNTK library. One of the important features, not only in CNTK but in every deep neural network (DNN), is the learning rate. In an ANN the learning rate is the factor by which the gradient is multiplied before it is subtracted from the weight. If the weight is changed too much, the loss function will increase and the network will diverge. On the other hand, if the weight is changed too little, the loss function will change very little and training progress will be too slow. So selecting the right value of this parameter is important. During the training process the learning rate is usually defined as a constant value. In CNTK the learning rate is defined as follows:

```csharp
// set learning rate for the network
var learningRate = new TrainingParameterScheduleDouble(0.2, 1);
```

In the code above the learning rate is set to the value 0.2 per sample. This means the whole training process will be performed with a learning rate of 0.2. CNTK also supports dynamic changing of the learning rate.
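To see what the learning rate actually controls, here is a toy gradient-descent sketch in plain C# (not CNTK) that applies the update rule weight ← weight − lr · gradient to a simple one-dimensional function:

```csharp
using System;
using System.Globalization;

class LearningRateDemo
{
    static void Main()
    {
        // minimize f(w) = (w - 3)^2 with gradient descent; gradient f'(w) = 2(w - 3)
        double w = 0.0;
        double lr = 0.2; // the learning rate scales every gradient step

        for (int i = 0; i < 100; i++)
            w -= lr * 2.0 * (w - 3.0); // w <- w - lr * gradient

        Console.WriteLine("w=" + w.ToString("F4", CultureInfo.InvariantCulture)); // prints w=3.0000
    }
}
```

With `lr = 0.2` the iteration converges to the minimum at w = 3; a much larger rate would overshoot and diverge, a much smaller one would crawl — which is exactly why a schedule that starts large and decays is often useful.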
Assume we want to set up different learning rates so that from the first to the 100th iteration the learning rate is 0.2, from the 100th to the 500th iteration it is 0.1, and after the 500th iteration, to the end of the training process, it is 0.05. This can be expressed as:

- lr1 = 0.2, from iteration 1 to 100,
- lr2 = 0.1, from iteration 100 to 500,
- lr3 = 0.05, from iteration 500 to the end of the training process.

In case we want to set the learning rate dynamically, we need to use the PairSizeTDouble class to define the learning rate. For the above requirements the following code should be implemented:

```csharp
PairSizeTDouble p1 = new PairSizeTDouble(2, 0.2);
PairSizeTDouble p2 = new PairSizeTDouble(10, 0.1);
PairSizeTDouble p3 = new PairSizeTDouble(1, 0.05);

var vp = new VectorPairSizeTDouble() { p1, p2, p3 };
var learningRatePerSample = new CNTK.TrainingParameterScheduleDouble(vp, 50);
```

First we define a PairSizeTDouble object for every learning rate value, together with the integer factor it will be multiplied by. Once the rates are defined, we make an array of rate values by creating a VectorPairSizeTDouble object. The array is then passed as the first argument of the TrainingParameterScheduleDouble constructor. The second argument of the constructor is the multiplication unit. So for the first rate value, 2 is multiplied by 50, which gives 100 and denotes the iteration number; similar multiplications apply to the other rate values.

# Testing and Validation CNTK models using C#

…continuing from the previous post. Once the model is built and the Loss and Validation functions satisfy our expectations, we need to validate and test the model using data which was not part of the training data set (unseen data). Model validation is very important because we want to see if our model is trained well, so that it evaluates unseen data approximately as well as the training data.
A model which cannot predict the output on unseen data is called an overfitted model. Overfitting can happen when the model is trained long enough that it shows very high performance on the training data set, but evaluates badly on the testing data. We will continue with the implementation from the previous two posts and implement model validation. After the model is trained, the model and the trainer are passed to the evaluation method. The evaluation method loads the testing data and calculates the output using the passed model. Then it compares the calculated (predicted) values with the output from the testing data set and calculates the accuracy. The following source code shows the evaluation implementation.

```csharp
private static void EvaluateIrisModel(Function ffnn_model, Trainer trainer, DeviceDescriptor device)
{
    var dataFolder = "Data";//files must be in the same folder as the program
    var trainPath = Path.Combine(dataFolder, "testIris_cntk.txt");
    var featureStreamName = "features";
    var labelsStreamName = "label";

    //extract features and label from the model
    var feature = ffnn_model.Arguments[0];
    var label = ffnn_model.Output;

    //stream configuration to distinguish features and labels in the file
    var streamConfig = new StreamConfiguration[]
    {
        new StreamConfiguration(featureStreamName, feature.Shape[0]),
        new StreamConfiguration(labelsStreamName, label.Shape[0])
    };

    // prepare the testing data
    var testMinibatchSource = MinibatchSource.TextFormatMinibatchSource(
        trainPath, streamConfig, MinibatchSource.InfinitelyRepeat, true);
    var featureStreamInfo = testMinibatchSource.StreamInfo(featureStreamName);
    var labelStreamInfo = testMinibatchSource.StreamInfo(labelsStreamName);

    int batchSize = 20;
    int miscountTotal = 0, totalCount = 20;
    while (true)
    {
        var minibatchData = testMinibatchSource.GetNextMinibatch((uint)batchSize, device);
        if (minibatchData == null || minibatchData.Count == 0)
            break;
        totalCount += (int)minibatchData[featureStreamInfo].numberOfSamples;

        // expected labels are in the minibatch data
        var labelData = minibatchData[labelStreamInfo].data.GetDenseData<float>(label);
        var expectedLabels = labelData.Select(l => l.IndexOf(l.Max())).ToList();

        var inputDataMap = new Dictionary<Variable, Value>() {
            { feature, minibatchData[featureStreamInfo].data }
        };
        var outputDataMap = new Dictionary<Variable, Value>() {
            { label, null }
        };

        ffnn_model.Evaluate(inputDataMap, outputDataMap, device);
        var outputData = outputDataMap[label].GetDenseData<float>(label);
        var actualLabels = outputData.Select(l => l.IndexOf(l.Max())).ToList();

        int misMatches = actualLabels.Zip(expectedLabels, (a, b) => a.Equals(b) ? 0 : 1).Sum();
        miscountTotal += misMatches;
        Console.WriteLine($"Validating Model: Total Samples = {totalCount}, Mis-classify Count = {miscountTotal}");

        if (totalCount >= 20)
            break;
    }
    Console.WriteLine($"---------------");
    Console.WriteLine($"------TESTING SUMMARY--------");
    float accuracy = 1.0F - miscountTotal / (float)totalCount;
    Console.WriteLine($"Model Accuracy = {accuracy}");
}
```

The implemented method is called from the previously shown Training method:

```csharp
EvaluateIrisModel(ffnn_model, trainer, device);
```

As can be seen, the model validation has shown that the model predicts the data with high accuracy, which is shown in the following picture. This was the latest post in a series of blog posts about using feed forward neural networks to train the Iris data using CNTK and C#. The full source code for all three samples can be found here.

# Train Iris data by Batch using CNTK and C#

In the previous post we have seen how to train a NN model by using a MinibatchSource, which we should usually use when we have a large amount of data. When the amount of data is small, all of it can be loaded into memory and passed to the trainer at each iteration. This blog post will implement this kind of feeding of the trainer. We will reuse the previous implementation, so the previous source code can be the starting point. For data loading we have to define a new method. The Iris data is stored in text format like the following:

```
sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa(1 0 0)
7.0,3.2,4.7,1.4,versicolor(0 1 0)
7.6,3.0,6.6,2.1,virginica(0 0 1)
...
```

The output column is encoded with the 1-N-1 encoding rule we have seen previously. The method reads all the data from the file, parses it, and creates two float arrays:

- float[] feature, and
- float[] label.

As can be seen, both arrays are 1D, which means all data will be stored in one dimension, because CNTK requires it so. Since the data is in a 1D array, we also have to provide the dimensionality of the data so that CNTK can resolve which values belong to each feature. The following listing shows the loading of the Iris data into two 1D arrays returned as a tuple.
```csharp
static (float[], float[]) loadIrisDataset(string filePath, int featureDim, int numClasses)
{
    var rows = File.ReadAllLines(filePath);
    var features = new List<float>();
    var label = new List<float>();
    for (int i = 1; i < rows.Length; i++)
    {
        var row = rows[i].Split(',');
        var input = new float[featureDim];
        for (int j = 0; j < featureDim; j++)
        {
            input[j] = float.Parse(row[j], CultureInfo.InvariantCulture);
        }
        var output = new float[numClasses];
        for (int k = 0; k < numClasses; k++)
        {
            int oIndex = featureDim + k;
            output[k] = float.Parse(row[oIndex], CultureInfo.InvariantCulture);
        }
        features.AddRange(input);
        label.AddRange(output);
    }
    return (features.ToArray(), label.ToArray());
}
```

Once the data is loaded, only a small amount of the previous code has to be changed in order to implement batching instead of using a MinibatchSource. At the beginning we declare several variables defining the NN model structure. Then we call loadIrisDataset and define xValues and yValues, which we use to create the feature and label input variables. Then we create a dictionary which connects the features and labels with the data values, and which we will later pass to the trainer. The next part of the code is the same as in the previous version: creating the NN model, the Loss and Evaluation functions, and the learning rate. Then we create a loop with 800 iterations. Once the iterations reach the maximum number, the program outputs the model properties and terminates. All of the above is implemented in the following code.
```csharp
public static void TrainIriswithBatch(DeviceDescriptor device)
{
    //data file path
    var iris_data_file = "Data/iris_with_hot_vector.csv";

    //network definition
    int inputDim = 4;
    int numOutputClasses = 3;
    int numHiddenLayers = 1;
    int hidenLayerDim = 6;
    int sampleSize = 130;

    //load the data into memory
    var dataSet = loadIrisDataset(iris_data_file, inputDim, numOutputClasses);

    // build a NN model
    //define the input and output data values
    var xValues = Value.CreateBatch<float>(new NDShape(1, inputDim), dataSet.Item1, device);
    var yValues = Value.CreateBatch<float>(new NDShape(1, numOutputClasses), dataSet.Item2, device);

    //define input and output variables
    var feature = Variable.InputVariable(new NDShape(1, inputDim), DataType.Float);
    var label = Variable.InputVariable(new NDShape(1, numOutputClasses), DataType.Float);

    //combine variables and data into a Dictionary for the training
    var dic = new Dictionary<Variable, Value>();
    dic.Add(feature, xValues);
    dic.Add(label, yValues);

    //build a simple Feed Forward Neural Network model
    var ffnn_model = createFFNN(feature, numHiddenLayers, hidenLayerDim, numOutputClasses, Activation.Tanh, "IrisNNModel", device);

    //Loss and error function definitions
    var trainingLoss = CNTKLib.CrossEntropyWithSoftmax(new Variable(ffnn_model), label, "lossFunction");
    var classError = CNTKLib.ClassificationError(new Variable(ffnn_model), label, "classificationError");

    // set the learning rate for the network
    var learningRatePerSample = new TrainingParameterScheduleDouble(0.001125, 1);

    //define the learner for the NN model
    var ll = Learner.SGDLearner(ffnn_model.Parameters(), learningRatePerSample);

    //define the trainer based on the ffnn_model, the loss and error functions, and the SGD learner
    var trainer = Trainer.CreateTrainer(ffnn_model, trainingLoss, classError, new Learner[] { ll });

    //preparation for the iterative learning process:
    //800 epochs/iterations; the batch size is the same as the sample size since the data set is small
    int epochs = 800;
    int i = 0;
    while (epochs > -1)
    {
        trainer.TrainMinibatch(dic, device);
        //print progress
        printTrainingProgress(trainer, i++, 50);
        epochs--;
    }

    //summary of the training
    double acc = Math.Round((1.0 - trainer.PreviousMinibatchEvaluationAverage()) * 100, 2);
    Console.WriteLine($"------TRAINING SUMMARY--------");
    Console.WriteLine($"The model trained with the accuracy {acc}%");
}
```


If we run the code, the output will be the same as we got from the previous blog post example: