Artificial Intelligence Beginnings

Description

A Neural Network is one of the most commonly used computational models to apply Artificial Intelligence to real-world problems. It consists of sets of units, called neurons, connected to each other to transmit and process signals.

Each neuron is connected to others by links. Through these links it receives information, evaluates it, and passes on the outcome of that evaluation, which can be a signal either to highlight a feature or to attenuate it.

Objective

This post will explain how to create a Neural Network from scratch, using just the Python language, and how to use it to examine cars and predict their fuel efficiency in miles per gallon.

First, there are some concepts that should be explained:

Structure

Commonly, Neural Networks are implemented as several layers of neurons. The first layer, called the input layer or layer 0, represents the features of the object being evaluated. It differs from all the other layers in that it doesn’t perform any calculations; it just represents the data fed to the network.

In our example, the first layer will hold the features of the cars that will be examined, such as the number of cylinders, weight, acceleration, year, and manufacturing origin.

Then, the model will contain two hidden layers. They are called hidden because their values are never observed directly: they are neither the inputs of the network nor its outputs. Each hidden layer is responsible for evaluating the relationships between the values coming from the previous layer and passing a result on to the next layer.

Finally, we will have an output layer, which will receive the results of the last hidden layer and calculate the distance that the car can travel on a gallon of fuel.
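To make this structure concrete, here is a minimal sketch in plain Python that lists the layers and counts how many weights and biases they would need. The layer sizes (7 features, 8 neurons in each hidden layer) and the helper names `parameter_shapes` and `count_parameters` are assumptions chosen for illustration; only the single output neuron comes from the description above.

```python
# Hypothetical layer sizes: only the single output neuron is given in the text.
N_FEATURES = 7   # assumed number of car features (cylinders, weight, ...)
HIDDEN_1 = 8     # assumed size of the first hidden layer
HIDDEN_2 = 8     # assumed size of the second hidden layer
N_OUTPUTS = 1    # one output neuron: the predicted miles per gallon

layer_sizes = [N_FEATURES, HIDDEN_1, HIDDEN_2, N_OUTPUTS]


def parameter_shapes(sizes):
    """Return one (neurons, inputs_per_neuron) pair per non-input layer."""
    return [(n_out, n_in) for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def count_parameters(sizes):
    """Count all weights plus one bias per neuron in the non-input layers."""
    return sum(n_out * n_in + n_out for n_out, n_in in parameter_shapes(sizes))


shapes = parameter_shapes(layer_sizes)
total = count_parameters(layer_sizes)
print(shapes)  # the input layer has no entry because it performs no calculations
print(total)
```

Note that the input layer contributes no parameters, which matches the idea that it only holds the data.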

The following image is an example of the connections between the neurons of the different layers:

Neurons

Each layer will be formed by several neurons, except the last layer, the output layer, which in this example will have only one neuron.

Each of the network’s neurons will be represented by the following formula:

$$z = W \cdot X + b$$

Where X is a vector (a one-dimensional array) with the information coming from the neurons of the previous layer. A neuron in the first hidden layer simply receives the features of the car from the input layer.

W is a vector that assigns a weight to each value coming from the neurons of the previous layer. The weights are one of the main components that will be optimized so that the network produces accurate results.

b is a single value, called the bias, that applies an offset to the result. It will also be optimized.
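As a minimal sketch of that calculation in plain Python, the weighted sum plus bias could look like the following. The function name `neuron_output` and all the numbers are made up for illustration; they are not taken from the car dataset.

```python
def neuron_output(x, w, b):
    """Weighted sum of the inputs plus the bias: W . X + b."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Made-up values standing in for three inputs from the previous layer.
x = [4.0, 2.0, 1.0]
w = [0.5, -1.0, 2.0]   # one weight per input
b = 3.0                # bias (offset)

z = neuron_output(x, w, b)
print(z)  # 0.5*4 + (-1)*2 + 2*1 + 3 = 5.0
```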

Non-linear Activation

The function above is linear. If all the neurons were linear functions, the output of the whole network would also be just another linear function of its inputs, no matter how many layers it had. This would prevent the network from identifying the complex relationships that exist between the features.
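A small experiment can illustrate why stacking linear neurons adds nothing. The sketch below chains two one-input neurons and shows that a single neuron with suitably combined parameters gives exactly the same outputs. The helper `affine` and the integer parameters are hypothetical, picked only to keep the arithmetic easy to follow.

```python
def affine(w, b):
    """Return the one-input linear neuron x -> w * x + b."""
    return lambda x: w * x + b


# Two stacked one-input neurons with made-up integer parameters.
first = affine(3, 1)     # 3x + 1
second = affine(-2, 5)   # -2x + 5


def stacked(x):
    """Feed the output of the first neuron into the second one."""
    return second(first(x))


# Algebra: -2 * (3x + 1) + 5 = -6x + 3, which is again a single linear neuron.
collapsed = affine(-2 * 3, -2 * 1 + 5)

inputs = [-4, 0, 1, 10]
print([stacked(x) for x in inputs])
print([collapsed(x) for x in inputs])  # same values as the line above
```

The same collapse happens with full weight vectors and any number of layers, which is the motivation for adding a non-linear step to each neuron.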

To solve this, a non-linear component, called the activation function, is added to each neuron. In our case, we will use a popular activation function called ReLU (Rectified Linear Unit), which is nothing more than:

$$\mathrm{ReLU}(z) = \max(0, z)$$

So the complete formula of each neuron, except for that of the output layer, will be:

$$a = \mathrm{ReLU}(W \cdot X + b) = \max(0,\ W \cdot X + b)$$
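In plain Python, a hidden neuron with its ReLU activation might be sketched as follows. The names `relu` and `hidden_neuron` and the sample weights are assumptions for illustration only.

```python
def relu(z):
    """Rectified Linear Unit: negative values become 0, others pass through."""
    return max(0.0, z)


def hidden_neuron(x, w, b):
    """Apply ReLU to the weighted sum W . X + b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return relu(z)


x = [1.0, 2.0]  # made-up inputs from the previous layer

# Weighted sum is 0.5 + 0.5 + 1.0 = 2.0, so ReLU leaves it unchanged.
positive = hidden_neuron(x, [0.5, 0.25], 1.0)

# Weighted sum is -1.0 - 2.0 + 0.5 = -2.5, so ReLU clips it to 0.
negative = hidden_neuron(x, [-1.0, -1.0], 0.5)

print(positive, negative)
```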

Final Linear Activation

As the goal is to predict a value (regression) instead of detecting or classifying an object (classification), the neuron of the output layer must be able to produce any real value. For that reason, no non-linear activation is applied to it: its output is just the weighted sum plus the bias.
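Putting the pieces together, a forward pass through two ReLU hidden layers and a linear output neuron could be sketched like this. The functions `dense` and `predict`, the tiny layer sizes, and the hand-picked parameters are all hypothetical; a real network would learn its weights and biases from the car data.

```python
def relu(z):
    """Rectified Linear Unit."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation=None):
    """Compute one layer; each row of `weights` belongs to one neuron."""
    outputs = []
    for w, b in zip(weights, biases):
        z = sum(wi * xi for wi, xi in zip(w, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def predict(features, params):
    """Forward pass: ReLU on both hidden layers, no activation on the output."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = dense(features, w1, b1, relu)
    h2 = dense(h1, w2, b2, relu)
    return dense(h2, w3, b3)[0]  # the output layer has a single neuron


# Hand-picked, untrained parameters: 2 features, 2 + 2 hidden neurons, 1 output.
params = [
    ([[1.0, 0.0], [0.0, -1.0]], [0.0, 0.0]),
    ([[1.0, 1.0], [-1.0, 0.0]], [0.0, 0.5]),
    ([[2.0, 1.0]], [-10.0]),
]

prediction = predict([3.0, 2.0], params)
print(prediction)  # -4.0: the linear output is not clipped at zero
```

With these untrained parameters the prediction is negative, which makes no sense as a fuel-efficiency figure but shows that the output neuron can produce any real value; training is what moves the result toward realistic numbers.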
