Building a simple neural network in C#

Leonardo Schmitt Alves
Published in Analytics Vidhya
6 min read · Dec 5, 2019

How hard is it to build a simple neural network in C# without the support of an AI library? Let’s find out…

So why am I doing this?

I am a .NET guy with a background in this language going back to 2012. But I had never really tried to build an AI agent with C#. The closest I got was when I implemented a GLPK solver for a linear optimization problem at one of my past jobs, but we can talk about that another time.

And also, one of my professors at college asked for it, so here I am…

First I want to say thanks to Milo Spencer-Harper for the article How to build a simple neural network in 9 lines of Python code. I used it as a guide during this quest, and I really recommend reading it before you start this one.

A little overview of neural networks

This AI approach tries to mimic the way that a natural brain processes information to generate actions. So the ANN tries to replicate the brain’s neurons and synapses.

The neurons, in an artificial neural network, are organized in a weighted graph where each node is a neuron and the weighted edges represent the synapses.

An ANN is composed of layers of neurons: the input layer, the hidden layers and the output layer.

Image source: https://www.innoarchitech.com/blog/artificial-intelligence-deep-learning-neural-networks-explained

The input layer holds the sensors of the agent; they will perceive the environment. An ANN agent will have just one input layer, but it’s possible to create interconnected agents by connecting the output of one agent to the input of another one.

The hidden layers will process the data sent by the input layer, and an agent can have as many hidden layers as it needs.

And the output layer will generate our result.

The neurons

To have a better understanding of how the ANN works, we need to go deeper into the functioning of the neurons.

Image source: https://www.innoarchitech.com/blog/artificial-intelligence-deep-learning-neural-networks-explained

All the neurons use the concept shown in the image above.

  1. Each synapse (represented by X1…Xn) has a weight (represented by W1…Wn).
  2. The neuron will multiply the value of each input by the weight of its synapse and then sum them all.
  3. A bias value (represented by b) is used to delay the activation of the neuron. It works as a minimum value the sum must reach before the neuron fires.
  4. The summed value of the synapses plus the bias (represented by Z) will be used in the activation function.
  5. The activation function is used to normalize the result of the sum into a simple 1-or-0 result: 1 = active, 0 = not active.
  6. The result of the activation function will determine whether the neuron activates or not.
  7. If the neuron activates, another neuron can receive another synapse, or if this is the output layer, we have our result.
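The steps above can be sketched as a single neuron in C#. The weights, bias and input values below are illustrative numbers I made up, not values from the article’s code:

```csharp
using System;

public class NeuronDemo
{
    // Step 5: sigmoid activation squashes the weighted sum into (0, 1)
    public static double Sigmoid(double z) => 1.0 / (1.0 + Math.Exp(-z));

    // Steps 2-4: multiply each input by its synapse weight, sum,
    // add the bias, and pass the total (Z) to the activation function
    public static double Activate(double[] inputs, double[] weights, double bias)
    {
        double z = bias;
        for (int i = 0; i < inputs.Length; i++)
            z += inputs[i] * weights[i];
        return Sigmoid(z);
    }

    public static void Main()
    {
        double[] inputs  = { 1, 0, 1 };         // X1..X3
        double[] weights = { 0.5, -0.6, 0.3 };  // W1..W3 (illustrative values)
        double output = Activate(inputs, weights, bias: -0.2);
        Console.WriteLine(output); // a value strictly between 0 and 1
    }
}
```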

The activation function

Most neural network examples use the sigmoid as the activation function.

But why?

The main reason why we use sigmoid function is because it exists between (0 to 1). Therefore, it is especially used for models where we have to predict the probability as an output. Since probability of anything exists only between the range of 0 and 1, sigmoid is the right choice.

I took this text from a great article written by Sagar Sharma; you can find the full article here.
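You can check those bounds yourself with a minimal sigmoid in C# (the class and method names here are my own):

```csharp
using System;

public class SigmoidDemo
{
    public static double Sigmoid(double z) => 1.0 / (1.0 + Math.Exp(-z));

    public static void Main()
    {
        Console.WriteLine(Sigmoid(0));    // exactly 0.5, halfway between the bounds
        Console.WriteLine(Sigmoid(10));   // very close to 1
        Console.WriteLine(Sigmoid(-10));  // very close to 0
    }
}
```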

The training steps

  1. Before the training starts, the program needs to know the expected output for a given set of inputs.
  2. Each synapse has a weight. In the beginning, these weights are usually random numbers.
  3. When the program analyzes the outputs produced by the algorithm, each one is compared to the expected output.
  4. The training function will perform small adjustments on the weights to compensate for the error in the output.
  5. Repeat this a lot of times, and then your neural network is trained.

The interesting thing here is that the knowledge gained from training remains only in the weights of the synapses.

But how does the algorithm know how much weight needs to be adjusted?

Well… of course we have a formula for that, and it has a name: the Error Weighted Derivative formula.

If you do some research about this you will find some very scary stuff, really…

But I want to keep it simple… So I made a summary of the information from the Milo Spencer-Harper article.

The final formula is given by:

adjustment = error × input × output × (1 − output)

  1. The adjustment is the value that needs to be added to the current weight of the synapse.
  2. The error is given by the difference between the expected result and the actual result, so error = target_output − actual_output.
  3. The input is the actual input value. It can be 1 or 0. If the input is 0, the whole formula cancels itself out; if it is 1, the formula can be calculated and generates an adjustment.
  4. The last part is the gradient of the sigmoid curve, which is given by (1 minus the current value of the output) multiplied by the current value of the output.
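Putting the four parts together, a single weight adjustment looks like this in C# (the target, output and input values are made-up numbers for illustration):

```csharp
using System;

public class AdjustmentDemo
{
    // Error Weighted Derivative: how much to add to one synapse weight
    public static double Adjustment(double targetOutput, double actualOutput, double input)
    {
        double error = targetOutput - actualOutput;          // item 2
        double gradient = actualOutput * (1 - actualOutput); // item 4: slope of the sigmoid curve
        return error * input * gradient;                     // item 1: the adjustment itself
    }

    public static void Main()
    {
        // The neuron produced 0.7 but we expected 1
        Console.WriteLine(Adjustment(1.0, 0.7, 1.0)); // ≈ 0.063
        // Item 3: a zero input cancels the whole formula
        Console.WriteLine(Adjustment(1.0, 0.7, 0.0)); // 0
    }
}
```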

Let’s start coding…

This is going to be simple code; we don’t need a fancy interface. So we are going to create a .NET Core console application using the following command.

dotnet new console

I created a git repository to hold the code; you can access it through this link.

And here is our final code for a simple neural network in C#. You can clone the repository from GitHub and run the code with the following command inside the cloned folder.

dotnet run

Results

To test it, I used the same data set used by Milo Spencer-Harper in his article.

The table below shows the training set. You can notice that when the value of the first input column is 1, the output is also 1.

Input 1 | Input 2 | Input 3 | Output
   0    |    0    |    1    |   0
   1    |    1    |    1    |   1
   1    |    0    |    1    |   1
   0    |    1    |    1    |   0
   1    |    0    |    0    |   ?

So we know that the answer for the last row is 1.

If we train our algorithm with this table and then ask for the answer to the last row, the value should be 1, or a value very close to 1.
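As a sketch of the whole experiment, here is my own condensed single-neuron version of that training loop in C#; it is not the code from the repository, and it omits the bias term, like the original Python article:

```csharp
using System;

public class TinyNetwork
{
    public static double Sigmoid(double z) => 1.0 / (1.0 + Math.Exp(-z));

    // Forward pass for one neuron with no bias term
    public static double Predict(double[] input, double[] weights)
    {
        double z = 0;
        for (int i = 0; i < input.Length; i++) z += input[i] * weights[i];
        return Sigmoid(z);
    }

    // Train the weights using the Error Weighted Derivative adjustment
    public static double[] Train(double[][] inputs, double[] targets, int iterations)
    {
        var rng = new Random(1);            // fixed seed so runs are repeatable
        double[] weights = new double[3];   // start with random weights in (-1, 1)
        for (int i = 0; i < 3; i++) weights[i] = 2 * rng.NextDouble() - 1;

        for (int iter = 0; iter < iterations; iter++)
            for (int row = 0; row < inputs.Length; row++)
            {
                double output = Predict(inputs[row], weights);
                double error = targets[row] - output;
                for (int i = 0; i < 3; i++)
                    weights[i] += error * inputs[row][i] * output * (1 - output);
            }
        return weights;
    }

    public static void Main()
    {
        double[][] inputs =
        {
            new double[] { 0, 0, 1 },
            new double[] { 1, 1, 1 },
            new double[] { 1, 0, 1 },
            new double[] { 0, 1, 1 }
        };
        double[] targets = { 0, 1, 1, 0 };

        double[] weights = Train(inputs, targets, 10000);
        // Ask for the answer to the unseen last row { 1, 0, 0 }
        Console.WriteLine(Predict(new double[] { 1, 0, 0 }, weights)); // very close to 1
    }
}
```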

And we did it. After a lot of effort to write the code in C# without the support of numpy, we really did it.

Final thoughts

Without the use of any auxiliary library, it is kind of a confusing thing to do, with all the matrix multiplications and matrix transpositions.

Using Python with the numpy library here would have saved us at least half of the effort.

Is it possible to implement a neural network with no support from libraries?

Yes, it is. But is it worth the effort? Well… I would say no. I needed four times more lines of code to achieve what Python does in the Milo Spencer-Harper article. And that was counting the comments…

And one last thing: take care with the parentheses when you are doing a calculation. I lost a day figuring out why the result of the algorithm was so wrong, and then I found out that I had forgotten a pair of parentheses in the sigmoid function.
