Welcome back to the second part of the Machine Learning Crash Course… 🌟🌟🌟🌟🌟

In the first part we covered the basic terminology of Machine Learning and took a first look at Colab – a Python-based development environment which is great for solving Machine Learning exercises with Python and TensorFlow.

In this second part we'll move on to our first practical machine learning scenario: solving a simple linear regression problem. First, let's clarify what linear regression is in general.

Linear Regression

The first Machine Learning exercise we're going to solve is a simple linear regression task. Linear regression is a linear approach to modelling the relationship between a dependent variable and one or more independent variables. If only one independent variable is used, we're talking about a simple linear regression. A simple linear regression is what we'll be using for the Machine Learning exercise in this tutorial:

y = 2x + 30

In this example x is the independent variable and y is the dependent variable. For every input value of x the corresponding output value y can be determined; for instance, x = 5 yields y = 2 · 5 + 30 = 40.
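Expressed as plain Python, this relationship is nothing more than the following function:

def f(x):
  return 2 * x + 30

print(f(5))    # 40
print(f(-10))  # 10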

Create A New Colab Notebook And Import Dependencies

To get started, let's create a new Python 3 Colab notebook. Go to https://colab.research.google.com, log in with your Google account, and create a new notebook, which is initially empty.

As the first step we need to import the needed libraries. We'll use TensorFlow, NumPy and Matplotlib. Insert the following lines of code in a code cell:

from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

The first line of imports is only there for Python 2/3 compatibility reasons and can be ignored. The import statements for TensorFlow, NumPy and Matplotlib work out of the box because all three libraries are preinstalled in the Colab environment.
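To quickly confirm that the environment is set up, you can optionally print the TensorFlow version:

print(tf.__version__)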

Preparing The Training Data

Having imported the needed libraries, the next step is to prepare the training data which will be used to train our model. Add another code cell and insert the following Python code:

values_x = np.array([-10, 0, 2, 6, 12, 15], dtype=float)    # independent variable x
values_y = np.array([10, 30, 34, 42, 54, 60], dtype=float)  # corresponding y = 2x + 30

for i, x in enumerate(values_x):
  print("X: {} Y: {}".format(x, values_y[i]))

Two NumPy arrays are initialised here. The first array (values_x) contains the x values of our linear regression, i.e. the independent variable of y = 2x + 30. For each of the x values in the first array, the second array (values_y) contains the corresponding y value.

By using a for-loop the value pairs are printed out:
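The output should look like the following:

X: -10.0 Y: 10.0
X: 0.0 Y: 30.0
X: 2.0 Y: 34.0
X: 6.0 Y: 42.0
X: 12.0 Y: 54.0
X: 15.0 Y: 60.0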

If you like, you can also use Matplotlib to visualise the linear regression function as a graph:

x = np.linspace(-10, 10, 100)  # 100 evenly spaced x values
plt.title('Graph of y=2x+30')
plt.plot(x, x*2+30);  # the trailing semicolon suppresses the textual cell output

Creating The Model

Next, we're ready to create the model (neural network) which we need to solve our linear regression task. Insert the following code into the notebook:

model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1])  # one neuron, one input value
])

model.compile(loss='mean_squared_error',
              optimizer=tf.keras.optimizers.Adam(0.1))

Here we're using the TensorFlow-integrated Keras API to create our neural network. To create a new sequential model, the tf.keras.Sequential constructor is used.

Note:

Keras is a high-level interface for neural networks that runs on top of different back-ends. Its API is user-friendly, yet flexible enough to build all kinds of applications. Keras quickly gained traction after its introduction, and in 2017 the Keras API was integrated into core TensorFlow as tf.keras.

In Keras, you assemble layers to build models. A model is (usually) a graph of layers. The most common type of model is a stack of layers: the tf.keras.Sequential model.

The Sequential constructor expects an array (a stack) of layers. In our case it is just one layer of type Dense. A Dense layer can be seen as a linear operation in which every input is connected to every output by a weight and a bias. The number of neurons in the layer is specified by the first parameter, units. The shape of the input is determined by the parameter input_shape.

In our case we only need one input element, because for the linear regression problem we're trying to solve with the neural network we've only defined one independent variable (x). Furthermore, the Dense layer is set up in the simplest possible way: it consists of just one neuron.
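Since the layer has exactly two trainable parameters (one weight and one bias), you can verify this structure directly in the notebook with the built-in Keras model summary:

model.summary()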

With that simple neural network defined, it's easy to take a look at some of the internals to further understand how the neurons work. Each neuron has a specific weight which is adapted during training. The weight of every neuron in the fully connected Dense layer is multiplied with each input variable. As we have only defined one input variable (x), this input is multiplied with the weight w1 of the first and only neuron of the defined Dense layer. Furthermore, for each Dense layer a bias (b1) is added, so that the neuron computes:

y = w1 * x + b1

Now we can see why it is sufficient to add a Dense layer with just one neuron to solve our simple linear regression problem. By training the model, the weight of the neuron will approach a value of 2 and the bias will approach a value of 30. The trained neuron will then be able to provide the output y for any input x.

Having added the Dense layer to the sequential model, we finally need to compile the model in order to make it usable for training and prediction in the next step.

The compilation of the model is done by executing the method model.compile:

model.compile(loss='mean_squared_error', 
              optimizer=tf.keras.optimizers.Adam(0.1))

Here we need to specify which loss function and which type of optimizer to use.

Loss function:

Loss functions are a central concept in machine learning. By using a loss function the machine learning algorithm is able to measure how much a prediction deviates from the actual result. Based on that measurement, the machine learning algorithm knows whether the prediction results are getting better or worse.

The mean squared error is a specific loss function which is suitable for training a model for a linear regression problem.

As the name suggests, the mean squared error is measured as the average of the squared differences between predictions and actual observations. Due to the squaring, predictions which are far away from the actual values are penalized heavily in comparison to less deviated predictions.
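To make this concrete, here is a minimal NumPy sketch of the computation (the sample predictions are made-up values, just for illustration):

predictions = np.array([9.5, 29.0, 35.0])
actuals = np.array([10.0, 30.0, 34.0])
mse = np.mean((predictions - actuals) ** 2)  # average of squared differences
print(mse)  # 0.75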

Optimizer:

Based on the outcome which is calculated by the loss function, the optimizer is used to adjust the parameters of the model (weights and biases) so that the loss decreases. The value 0.1 we pass to the Adam optimizer is the learning rate, which controls how strongly the parameters are adjusted in each step.

In our example we're making use of the Adam optimizer, which works well for linear regression tasks.
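Adam is more sophisticated internally, but the general idea of an optimizer can be sketched with plain gradient descent (a simplified illustration, not what Adam actually does):

# one gradient descent step for a single data point of y = w*x + b
learning_rate = 0.1
w, b = 0.0, 0.0
x, y_true = 2.0, 34.0
y_pred = w * x + b
grad_w = 2 * (y_pred - y_true) * x  # derivative of the squared error w.r.t. w
grad_b = 2 * (y_pred - y_true)      # derivative of the squared error w.r.t. b
w -= learning_rate * grad_w         # move the parameters against the gradient
b -= learning_rate * grad_b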

Training The Model

The model is ready, and the next thing we need to do is to train it with the training data. This is done by using the model.fit method:

history = model.fit(values_x, values_y, epochs=500, verbose=False)

As the first and the second argument we're passing in the training values which are available in the arrays values_x and values_y. The third argument is the number of epochs used for training. An epoch is one iteration over the entire x and y data provided. In our example we're using 500 iterations over the training data set to train the model.
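Because verbose=False suppresses the per-epoch output, you can check the final loss value afterwards via the returned history object:

print(history.history['loss'][-1])  # loss of the last training epoch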

After executing the training of the model, let's take a look at the development of the loss over all 500 epochs. This can be printed as a diagram by using the following three lines of code:

plt.xlabel("Epoch Number")
plt.ylabel("Loss Magnitude")
plt.plot(history.history['loss'])

The result should be a diagram that looks like the following:

Here you can see that the loss decreases from epoch to epoch. Over the 500 epochs used for training, the loss magnitude approaches zero, which shows that the model is able to predict values with high accuracy.

Predicting Values

Now that the model is fully trained, let's try to perform a prediction by calling the model.predict method:

print(model.predict(np.array([20.0])))

The argument which is passed into the predict method is an array containing the x value for which the corresponding y value should be determined. The expected result should be somewhere near 70 (because y = 2x + 30). The output can be seen in the following:

Here we get back the value 70.05354, which is pretty close to 70.0, so our model is working as expected.

Getting Model Insights, Retrieving Weights And Bias

We're able to get more model insights by taking a look at the weight and the bias which are determined for the first layer:

print("These are the layer variables: {}".format(model.layers[0].get_weights()))

As expected, we get back two parameters for our first and only layer in the model:

The two parameters correspond to the two variables we have in the model:

  • Weight
  • Bias

The value determined for the weight is near the target value of 2, and the value determined for the bias is near the target value of 30 (according to our linear regression formula: y = 2x + 30).
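If you prefer to access the two values separately, you can unpack the list returned by get_weights() (a small optional addition):

weights, biases = model.layers[0].get_weights()
print("Weight: {:.5f}".format(weights[0][0]))
print("Bias: {:.5f}".format(biases[0]))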
