Vanilla Neural Networks in R

Take a Look Under The Hood of Neural Network Architecture: Design and Build a Neural Network, from Scratch, in R, without using any Deep Learning Frameworks or Packages

Contents:

  1. Introduction

  2. Background

  3. Semantics

  4. Set Up

  5. Get Data

  6. Check Data

  7. Prepare the Data

  8. Instantiate the Network

  9. Initialise the Network

  10. Forward Propagation

  11. Calculate the Cost

  12. Backward Propagation

  13. Update Model Parameters

  14. Run the Model End-to-End

  15. Create Prediction

  16. Conclusion

  17. Post Script


1. Introduction

Modern-day Data Science techniques frequently use robust frameworks for designing and building machine learning solutions. In the R community, packages such as the tidyverse and caret are frequently referenced; within Python, packages such as numpy, pandas, and scikit-learn are frequently referenced. There are even some packages that have been built to be used in either language, such as keras, pytorch, and tensorflow. However, the limitation of using these packages is the 'black box' phenomenon, where users do not understand what is happening behind the scenes (or 'under the hood', if you will). Users know how to use the functions, and can interpret the results, but don't necessarily know how the packages were able to achieve those results.

The purpose of this paper is to take a 'back to basics' approach to designing Deep Learning solutions. The intention is not to create the most predictive model, nor to use the latest and greatest techniques (such as convolution or recursion); the intention is to create a basic neural network, from scratch, using no frameworks, and to walk through the methodology.

Note: The word 'Vanilla' in 'Vanilla Neural Networks' simply refers to the fact that the network is built from scratch, without using any pre-existing frameworks in its construction.

2. Background

2.1. Context

There are already many websites and blogs which explain how this process is done, such as Jason Brownlee's article How to Code a Neural Network with Backpropagation In Python (from scratch), and DeepLearning.ai's notebook dnn_app_utils_v2.py (on the Microsoft Azure Notebooks network). However, these sources are all written in Python, which is fine if that is what is needed; there are some very legitimate reasons to use Python over other languages. But this paper will be written in R.

The R language was chosen for two reasons:

  1. I am familiar with the language. I can speak Python (along with other languages); I just chose R to show how it can be achieved using this language.
  2. To prove that there are many different ways to achieve the same outcome. So, while there are sometimes legitimate constraints for choosing one language over another (business legacy, technological availability, system performance, etc), sometimes one language is chosen simply because it is stylistically more preferable.

Therefore, let’s see how to architect and construct a Vanilla Neural Network in R.

2.2. What It’s Not

This article does not cover the latest and greatest Deep Learning architectures (like Convolution or Recursion). As such, the final performance may not be as good as it could be if those other architectures were used.

This article does not teach readers the theoretical mathematical concepts behind how Neural Networks operate. There are plenty of other lectures which teach this information (e.g. The Math Behind Neural Networks). In fact, this article assumes a lot of knowledge from the reader: about programming, about calculus, and about what a Neural Network conceptually is.

This article does not cover why Neural Networks work the way they do, nor the conceptual understanding behind a Feed-Forward architecture. There are plenty of other blogs (e.g. A Beginner Intro to Neural Networks) and videos (e.g. the Neural Networks series) which cover such information.

This article does not point the reader to other packages and applications which may already have this functionality set up and working. Packages like tensorflow and nnet already have this covered.

What this article actually is, is a functional, how-to walk-through for creating a Vanilla Neural Network (a Feed-Forward Network), from scratch, step-by-step, in the R programming language. It contains lots of code, and lots of technical details.

3. Semantics

3.1. Layout

This article is laid out in such a way as to describe how a Neural Network is built from the ground up. It will walk through the steps to:

  1. Access and check the data
  2. Instantiate and Initialise the network
  3. Run forward propagation
  4. Compute the cost
  5. Run backward propagation
  6. Update the model
  7. Set up a training method to loop through everything
  8. Predict and assess the performance

In the interest of brevity, the functions defined here will not include all the commentary and validations that should be included in a typical function. They will only include basic steps and prompts. However, the source code for this article (located here) does contain all the appropriate function docstrings and assertions.
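
To give a flavour of that style, here is a small, hypothetical example (illustrative only, not an excerpt from the source code) of a function documented with roxygen2-style docstrings and guarded with an assertthat validation:

let_ReLU <- function(Z) {
    #' Apply the ReLU Activation Function
    #'
    #' @param Z numeric. A matrix (or vector) of pre-activation values.
    #' @return An object of the same shape, with negative values set to zero.
    assertthat::assert_that(is.numeric(Z))
    return(pmax(Z, 0))
}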

3.2. Syntax

For the most part, the syntax in this article uses the 'pipe' method (the %>% operator, provided by the magrittr package and re-exported by dplyr). However, in certain sections the base R syntax is used (for example, in function declaration lines).
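
As a quick illustration of the difference (a generic example, not taken from the article's source code), the two styles below are equivalent:

## Base R syntax: nested calls, read inside-out
round(mean(sqrt(c(1, 4, 9))), digits=2)

## Pipe syntax: the same operation, read left-to-right
c(1, 4, 9) %>% sqrt() %>% mean() %>% round(digits=2)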

Throughout the article, many custom functions are written. Each of these is prefixed with one of the words get, let and set. The definitions of each are given below.

  • get_*():
    — Gets certain attributes or meta-data from the objects which are passed to the function.
    — Or uses the information passed to the function to derive and get other values or parameters.
  • set_*():
    — Sets (or 'updates') the objects which are passed to the function.
    — Usually used for updating the network during the forward and backward propagation processes.
  • let_*():
    — Similar to get, in that it takes the values passed to the function and derives an outcome; however, it lets this value be utilised by another object or function.
    — Used mainly for the initialisation and activation functions.
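
To make the convention concrete, here is one hypothetical function for each prefix (these are illustrative sketches, not functions from the source code):

## get_*: derives and gets a value from the object passed in
get_LayerCount <- function(layer_dims) {
    return(length(layer_dims) - 1)  #<-- The input layer is not counted
}

## let_*: derives a value to be utilised by another function
let_Sigmoid <- function(Z) {
    return(1 / (1 + exp(-Z)))
}

## set_*: updates the object passed in, and returns it
set_Attribute <- function(object, name, value) {
    attr(object, name) <- value
    return(object)
}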

4. Set Up

4.1. Load Packages

The first step is to import the relevant packages. The following list includes the main packages used throughout this process; the main purpose of each is also listed.

Note what is listed above about not using existing Deep Learning packages, and yet the tensorflow package is included. Why? Well, it is used only for accessing the data, which will be covered in the next section. The tensorflow package is not used for building or training any networks.

library(tensorflow)  #<-- Only used for getting the data
library(tidyverse)   #<-- Used for accessing various tools
library(magrittr)    #<-- Extends the `dplyr` syntax
library(grDevices)   #<-- For plotting the images
library(assertthat)  #<-- Function assertions
library(roxygen2)    #<-- Documentation is important
library(caret)       #<-- Doing data partitioning
library(stringi)     #<-- Some string manipulation parts
library(DescTools)   #<-- To properly check `is.integer`
library(tictoc)      #<-- Time how long different processes take
library(docstring)   #<-- Makes viewing the documentation easier
library(roperators)  #<-- Conveniently using functions like %+=%
library(plotROC)     #<-- For plotting predictions

5. Get Data

5.1. Download Data

The dataset to be used is the CIFAR-10 dataset. It’s chosen for a number of reasons, including:

  1. The data is on images, which is ideal for Deep Learning purposes;
  2. There are a decent number of images included (60,000 images in total);
  3. All images are the same size (32x32 pixels);
  4. The images have been categorised into 10 different classes; and
  5. It’s easily accessible through the TensorFlow package.

The code chunk below involves the following process steps:

  1. Get the data
    — In order to import the data, it is accessed through the keras element, which contains the suite of datasets, including the cifar10 part.
    — The load_data() function retrieves the data from the online GitHub repository.
  2. Extract the second element
    — The load_data() function returns two different objects:
      1. The Train dataset (containing 50,000 images);
      2. The Test dataset (containing 10,000 images).
    — The second element is extracted (by using the extract2(2) function) because only 10,000 images are needed for these purposes.
    — This article is to show the process of creating Vanilla Neural Networks; if more data is needed at a later stage, it can easily be accessed here.
  3. Name the parts
    — The data as downloaded contains two further elements:
      1. The images themselves (in the form of a 4-Dimensional array);
      2. The image labels (in the form of a 2-Dimensional, single-column array).
    — This data does not have any names, so the names are set by using the set_names() function.
## Download Data
## NOTE:
## - The first time you run this function, it downloads everything.
## - The next time you run it, TensorFlow will load it from cache.
cifar <- tf$keras$datasets$cifar10$load_data() %>% 
    extract2(2) %>% 
    set_names(c("images","classes"))

5.2. Get Class Definitions

One of the challenges behind accessing this data from the TensorFlow package is that the classes are only numeric values (0 to 9) for each type of image. The definitions for these images can be found on GitHub (GitHub > EN10 > CIFAR). These classes are defined in the following code chunk.

## Define classes
ClassList <- c(
    "0" = "airplane",
    "1" = "automobile",
    "2" = "bird",
    "3" = "cat",
    "4" = "deer",
    "5" = "dog",
    "6" = "frog",
    "7" = "horse",
    "8" = "ship",
    "9" = "truck" 
)
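
Once defined, a numeric class value can be mapped back to its label by coercing it to a character and indexing this vector (this usage example is not in the original article):

## Example: look up the label for class 3
ClassList[["3"]]

Which prints:

[1] "cat"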

6. Check Data

6.1. Check Objects

It is important to check the data, to ensure that it has been generated correctly and all the information looks okay. For this, a custom function is written (get_ObjectAttributes()), the source code for which can be found here. As seen in the following code chunk, the images object is a 4-Dimensional numeric array, with 10,000 images, each 32x32 pixels, and 3 colour channels. The entire object is over 117 Mb in size.
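
For reference, a rough, hypothetical sketch of what such a helper might look like is shown below; the names and output format are inferred from the printed results further down, and the real version in the source repository includes full docstrings and assertions.

get_ObjectAttributes <- function(object, name="object", print_freq=FALSE) {

    ## Build the basic attribute summary
    output <- paste0(
        "Name : ", name, "\n",
        " - Size : ", format(object.size(object), units="auto"), "\n",
        " - Clas : ", paste(class(object), collapse=","), "\n",
        " - Type : ", typeof(object), "\n",
        " - Mode : ", mode(object), "\n",
        " - Dims : ", paste(dim(object), collapse="x")
    )

    ## Optionally append a frequency table of the values
    if (print_freq) {
        freq <- object %>% 
            as.vector() %>% 
            table() %>% 
            as.data.frame() %>% 
            set_names(c("label", "Freq"))
        output <- paste0(
            output, "\n - Freq :\n",
            paste(capture.output(print(freq)), collapse="\n")
        )
    }

    ## Return
    return(output)

}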

## Check Images
cifar %>% 
    extract2("images") %>% 
    get_ObjectAttributes("cifar$images") %>% 
    cat()

Which prints:

Name : cifar$images
 - Size : 117.2 Mb
 - Clas : array
 - Type : integer
 - Mode : numeric
 - Dims : 10000x32x32x3

When checking the classes object, it is a 2-Dimensional numeric array (with only 1 column) with the same number of rows as there are images in the images object (which is to be expected), and each class label appears exactly 1,000 times. The total size is less than 40 Kb.

## Check classes
cifar %>% 
    extract2("classes") %>% 
    get_ObjectAttributes(name="cifar$classes", print_freq=TRUE) %>% 
    cat()

Which prints:

Name : cifar$classes
 - Size : 39.3 Kb
 - Clas : matrix,array
 - Type : integer
 - Mode : numeric
 - Dims : 10000x1
 - Freq :
      label Freq
   1  0     1000
   2  1     1000
   3  2     1000
   4  3     1000
   5  4     1000
   6  5     1000
   7  6     1000
   8  7     1000
   9  8     1000
   10 9     1000

6.2. Check Images

After having gained an appreciation of the size of the objects in memory, it is then worthwhile to check the actual images themselves. As humans, we understand actual images and colours better than we understand numbers.

In order to visualise the images, two custom functions are written, as shown in the following code chunk. These functions take in the data (as a 4-Dimensional array) and visualise the images as a plot. Note that the plotting step uses draw_image(), which comes from the cowplot package (not in the package list above), so it is called here with the cowplot:: namespace prefix.

set_MakeImage <- function(image) {

    ## Extract colour channels
    image.r <- image[,,1]
    image.g <- image[,,2]
    image.b <- image[,,3]

    ## Combine channels in to hex colour strings
    image.rgb <- rgb(
        image.r, 
        image.g, 
        image.b, 
        maxColorValue=255
    )

    ## Restore the 32x32 dimensions
    dim(image.rgb) <- dim(image.r)

    ## Return
    return(image.rgb)

}

plt_PlotImage <- function(images, classes, class_list, index=1) {

    ## Slice out one image, and look up its label
    image <- images[index,,,]
    image %<>% set_MakeImage()
    lbl <- classes %>% 
        extract(index) %>% 
        as.character() %>% 
        class_list[[.]]

    ## Create plot (draw_image() comes from the `cowplot` package)
    plot <- ggplot() + 
        ggtitle(lbl) +
        cowplot::draw_image(image, interpolate=FALSE)

    ## Return
    return(plot)

}

When the function is run on the first 16 images, the following is displayed. As shown, the images are extremely pixelated (which is expected, as they are only 32x32 pixels each), and you can see how each image has been categorised.

## Set list
lst <- list()

## Loop over the first 16 images
for (index in 1:16) {
    lst[[index]] <- plt_PlotImage(
        cifar$images, 
        cifar$classes, 
        ClassList,
        index
    )
}

## View images in a 4x4 grid
plt <- gridExtra::grid.arrange(grobs=lst, ncol=4)
