1604113500

# Logistic Regression using Single Layer Perceptron Neural Network (SLPNN)

## Introduction

Welcome to the final Retrospective of the ML challenge which is going to cover Weeks 4 to 10. As a quick introduction, for those who’d like to follow the full 10-week journey, here’re the links to all previous posts:

· Original post about the challenge: #10WeeksOfMachineLearningFun

· Links to previous retrospectives: #Week1 #Week2 #Week3

Weeks 4–10 has now been completed and so has the challenge!

I’m very pleased for coming that far and so excited to tell you about all the things I’ve learned, but first things first: as a quick explanation as to why I’ve ending up summarising the remaining weeks altogether and so late after completing this:

• The approach I selected for Logistic regression in #Week3 (Approximate Logistic regression function using a Single Layer Perceptron Neural Network — SLPNN) took longer to unravel, both from maths as well as from coding perspective that it was practically impossible to provide updates on a weekly basis
• Also, I probably digressed a bit during that period to understand some of the maths, which was good learning overall e.g. Cost functions and their derivatives, and most importantly when to use one over another and why :) (more on that below)
• Derivative of Cost function: given my approach in #Week3, I had to make sure that the Back-propagation chain rule maths for working out the partial derivative of the Cost function with respect to the weights, tied perfectly with the maths of the analytical calculation for the same partial derivative. And in order to do so, I had to put pen on paper multiple times over and over again until it finally made sense. You can definitely trust the maths once you’ve verified them in two different ways!
• In fact, I have created a handwritten single page cheat-sheet that shows all these, which I’m planning to publish separately so stay tuned.
• Finally, a fair amount of the time, planned initially to spend on the Challenge during weeks 4–10, went to real life priorities in professional and personal life. Which is exactly what happens at work, projects, life, etc… You just have to deal with the priorities and get back to what you’re doing and finish the job! So here I am!

#perceptron #neural-networks #data-science #machine-learning #logistic-regression

1604113500

## Introduction

Welcome to the final Retrospective of the ML challenge which is going to cover Weeks 4 to 10. As a quick introduction, for those who’d like to follow the full 10-week journey, here’re the links to all previous posts:

· Original post about the challenge: #10WeeksOfMachineLearningFun

· Links to previous retrospectives: #Week1 #Week2 #Week3

Weeks 4–10 has now been completed and so has the challenge!

I’m very pleased for coming that far and so excited to tell you about all the things I’ve learned, but first things first: as a quick explanation as to why I’ve ending up summarising the remaining weeks altogether and so late after completing this:

• The approach I selected for Logistic regression in #Week3 (Approximate Logistic regression function using a Single Layer Perceptron Neural Network — SLPNN) took longer to unravel, both from maths as well as from coding perspective that it was practically impossible to provide updates on a weekly basis
• Also, I probably digressed a bit during that period to understand some of the maths, which was good learning overall e.g. Cost functions and their derivatives, and most importantly when to use one over another and why :) (more on that below)
• Derivative of Cost function: given my approach in #Week3, I had to make sure that the Back-propagation chain rule maths for working out the partial derivative of the Cost function with respect to the weights, tied perfectly with the maths of the analytical calculation for the same partial derivative. And in order to do so, I had to put pen on paper multiple times over and over again until it finally made sense. You can definitely trust the maths once you’ve verified them in two different ways!
• In fact, I have created a handwritten single page cheat-sheet that shows all these, which I’m planning to publish separately so stay tuned.
• Finally, a fair amount of the time, planned initially to spend on the Challenge during weeks 4–10, went to real life priorities in professional and personal life. Which is exactly what happens at work, projects, life, etc… You just have to deal with the priorities and get back to what you’re doing and finish the job! So here I am!

#perceptron #neural-networks #data-science #machine-learning #logistic-regression

1623135499

## No Code introduction to Neural Networks

### The simple architecture explained

Neural networks have been around for a long time, being developed in the 1960s as a way to simulate neural activity for the development of artificial intelligence systems. However, since then they have developed into a useful analytical tool often used in replace of, or in conjunction with, standard statistical models such as regression or classification as they can be used to predict or more a specific output. The main difference, and advantage, in this regard is that neural networks make no initial assumptions as to the form of the relationship or distribution that underlies the data, meaning they can be more flexible and capture non-standard and non-linear relationships between input and output variables, making them incredibly valuable in todays data rich environment.

In this sense, their use has took over the past decade or so, with the fall in costs and increase in ability of general computing power, the rise of large datasets allowing these models to be trained, and the development of frameworks such as TensforFlow and Keras that have allowed people with sufficient hardware (in some cases this is no longer even an requirement through cloud computing), the correct data and an understanding of a given coding language to implement them. This article therefore seeks to be provide a no code introduction to their architecture and how they work so that their implementation and benefits can be better understood.

Firstly, the way these models work is that there is an input layer, one or more hidden layers and an output layer, each of which are connected by layers of synaptic weights¹. The input layer (X) is used to take in scaled values of the input, usually within a standardised range of 0–1. The hidden layers (Z) are then used to define the relationship between the input and output using weights and activation functions. The output layer (Y) then transforms the results from the hidden layers into the predicted values, often also scaled to be within 0–1. The synaptic weights (W) connecting these layers are used in model training to determine the weights assigned to each input and prediction in order to get the best model fit. Visually, this is represented as:

#machine-learning #python #neural-networks #tensorflow #neural-network-algorithm #no code introduction to neural networks

1595548920

## Comparison between Logistic Regression and Neural networks

I recently learned about logistic regression and feed forward neural networks and how either of them can be used for classification. What bugged me was what was the difference and why and when do we prefer one over the other. So, I decided to do a comparison between the two techniques of classification theoretically as well as by trying to solve the problem of classifying digits from the MNIST dataset using both the methods. In this article, I will try to present this comparison and I hope this might be useful for people trying their hands in Machine Learning.

# Source

The code that I will be using in this article are the ones used in the tutorials by Jovian.ml and freeCodeCamp on YouTube. The link has been provided in the references below.

# Problem Statement

Given a handwritten digit, the model should be able to tell whether the digit is a 0,1,2,3,4,5,6,7,8 or 9.

We will use the MNIST database which provides a large database of handwritten digits to train and test our model and eventually our model will be able to classify any handwritten digit as 0,1,2,3,4,5,6,7,8 or 9. It consists of 28px by 28px grayscale images of handwritten digits (0 to 9), along with labels for each image indicating which digit it represents. We will learn how to use this dataset, fetch all the data once we look at the code.

Let us have a look at a few samples from the MNIST dataset.

Examples of handwritten digits from the MNIST dataset

Now that we have a clear idea about the problem statement and the data-source we are going to use, let’s look at the fundamental concepts using which we will attempt to classify the digits.

You can ignore these basics and jump straight to the code if you are already aware of the fundamentals of logistic regression and feed forward neural networks.

# Logistic Regression

I will not talk about the math at all, you can have a look at the explanation of Logistic Regression provided by Wikipedia to get the essence of the mathematics behind it.

So, Logistic Regression is basically used for classifying objects. It predicts the probability(P(Y=1|X)) of the target variable based on a set of parameters that has been provided to it as input. For example, say you need to say whether an image is of a cat or a dog, then if we model the Logistic Regression to produce the probability of the image being a cat, then if the output provided by the Logistic Regression is close to 1 then essentially it means that Logistic Regression is telling that the image that has been provided to it is that of a cat and if the result is closer to 0, then the prediction is that of a dog.

It is called Logistic Regression because it used the logistic function which is basically a sigmoid function. A sigmoid function takes in a value and produces a value between 0 and 1. Why is this useful ? Because probabilities lie within 0 to 1, hence sigmoid function helps us in producing a probability of the target value for a given input.

The sigmoid/logistic function looks like:

where e is the exponent and **_t _**is the input value to the exponent.

Generally **_t _**is a linear combination of many variables and can be represented as :

The standard logistic function:

NOTE: Logistic Regression is simply a linear method where the predictions produced are passed through the non-linear sigmoid function which essentially renders the predictions independent of the linear combination of inputs.

# Neural networks

Artificial Neural Networks are essentially the mimic of the actual neural networks which drive every living organism. They are currently being used for variety of purposes like classification, prediction etc. In Machine Learning terms, why do we have such a craze for Neural Networks ? Because they can approximate any complex function and the proof to this is provided.

What does a neural network look like ? Like this:

ANN with 1 hidden layer, each circle is a neuron. Source: Wikipedia

That picture you see above, we will essentially be implementing that soon. Now, what you see in that image is called a neural network architecture, you can make your own architecture by defining more than one hidden layers, add more number of neurons to the hidden layers etc. Now, there are some different kind of architectures of neural networks currently being used by researchers like Feed Forward Neural NetworksConvolutional Neural NetworksRecurrent Neural Networks etc.

In this article we will be using the Feed Forward Neural Network as its simple to understand for people like me who are just getting into the field of machine learning.

# Feed Forward Neural Networks

Let us talk about perceptron a bit. This is a neural network unit created by Frank Rosenblatt in 1957 which can tell you to which class an input belongs to. It is a type of linear classifier.

What do you mean by linearly separable data ?

As you can see in image A that with one single line( which can be represented by a linear equation) we can separate the blue and green dots, hence this data is called linearly classifiable.

And what does a non-linearly separable data look like ?

Like the one in **image B. **We can see that the red and green dots cannot be separated by a single line but a function representing a circle is needed to separate them. As the separation cannot be done by a linear function, this is a non-linearly separable data.

Why do we need to know about linear/non-linear separable data ?

Because a single perceptron which looks like the diagram below is only capable of classifying linearly separable data, so we need feed forward networks which is also known as the multi-layer perceptron and is capable of learning non-linear functions.

A single layer perceptron and-multi-layer-perceptrons-the-artificial-neuron-at-the-core-of-deep-learning/)

A Feed forward neural network/ multi layer perceptron:

A multi layer perceptron(Feed Forward Neural Network). Source: missinglink.ai

I get all of this, but how does the network learn to classify ?

Well we must be thinking of this now, so how these networks learn comes from the perceptron learning rule which states that a perceptron will learn the relation between the input parameters and the target variable by playing around (adjusting ) the weights which is associated with each input. If the weighted sum of the inputs crosses a particular thereshold which is custom, then the neuron produces a true else it produces a false value. Now, when we combine a number of perceptrons thereby forming the Feed forward neural network, then each neuron produces a value and all perceptrons together are able to produce an output used for classification.

Now that was a lot of theory and concepts !

I have tried to shorten and simplify the most fundamental concepts, if you are still unclear, that’s perfectly fine. I am sure your doubts will get answered once we start the code walk-through as looking at each of these concepts in action shall help you to understand what’s really going on. I have also provided the references which have helped me understand the concepts to write this article, please go through them for further understanding.

Let’s start the most interesting part, the code walk-through!

# Importing Libraries

We will be working with the MNIST dataset for this article. torchvision library provides a number of utilities for playing around with image data and we will be using some of them as we go along in our code.

#logistic-regression #artificial-intelligence #machine-learning #classification-algorithms #artificial-neural-network #algorithms

1602954000

## Perceptron to Multi-layered Feedforward Neural Network

This article will help you understand the transition of AI from classical machine learning to deep learning, starting from the basics of machine learning with its major - supervised and unsupervised learning to different regularization and optimization techniques.

The goal of this article is to introduce the major concepts of machine learning-what is machine learning? How does a machine learn by itself to do a particular task? How does it choose the essential features of the data that strongly contribute to the prediction of future events? How can we understand whether a machine has succeeded or failed in that?

In this article, we will discuss the foundations of artificial neural networks starting from perceptron to multi-layered feedforward neural networks. This article discusses the various stages of how to apply different transformation techniques for data preparation, how to train a neural network, and then validate and deploy a neural network for solving real-world problems.

· Understanding Machine Learning and Artificial Neural Network

· Feedforward Neural Network & Backpropagation Algorithm

· Evaluating and Tuning the Artificial Neural Network

· Classical Machine Learning vs Deep Learning

## Understanding Machine Learning and Artificial Neural Network

This section starts with a brief overview of what is machine learning, its major types- supervised and unsupervised learning. Then we will understand the very evolution of artificial neural networks starting with how a biological neuron works. We’ll also discuss the design of artificial neurons with an understanding of deep neural networks with activation functions.

### What is Machine Learning?

The term, Machine learning, has become a buzzword nowadays which refers to the ability of a machine to learn from the data without the help of the set of rules that are defined explicitly as like in the traditional rule-based algorithms. So definitely, if it learns from the data without any need for an explicit declaration of rules then it has to do with the experience from learning.

Our way of learning always follows a curve of failures although it’s perfectly descendent, lastly it will converge to the extent of our hard work

In the last decade of technology, machine learning techniques have become the common tools to automate the tasks that would have required huge efforts with the traditional rule-based algorithms.

In the** traditional rule-based algorithms**, the set of rules used to be defined to work on with a specific variety of data and could not be generalized to a large extent of data because of its specificity of working on only particular data. For example, if YouTube, a video sharing site decides to perform a copyright check on videos that are being uploaded on its server with a human operator, it will need a lot many people to execute this task of copyright check. But if YouTube chooses to do this with the help of some video processing algorithm then the task of copyright check would be easier but not robust as video processing algorithm possibly would work only on a set of videos that don’t have any kind of transformations like flip, rotate, crop, blur, etc. And it’s quite difficult to write a separate algorithm for individual transformation so the solution to this problem can be machine learning. In this case, a learning model is built by getting trained on data and identifying implicit features that uniquely signifies the data with which new data can be validated automatically.

Today, we are living in the era of machine-learning-based technologies; email services learn how to classify the emails into spam and ham; search engines learn what to recommend to the user based on their search history; banking systems are now able to sanction loans based on the creditworthiness of a customer. Prediction of heart disease based on clinical data, identifying voice commands, and forecasting annual rainfall are other significant tasks that machine learning facilitates.

One common problem with all of these applications is that a programmer cannot explicitly define the set of instructions for the task that needs to be performed due to the underlying complexity of the data; this was machine learning helps. It has made itself useful across industries like retail, banking, healthcare to the automobile industry for its ability to predict future events with significant accuracy.

_In machine learning algorithms, the __input is the experience in form of data _and output is knowledge or wisdom gained with inductive inference which in turn helps to predict future events, so rather, machine learning is an art of experiential learning.

Let us start with a real-life experience of preparing a food dish with some cooking recipe, how do we prepare the food, let’s go through the process of making the delicious food, at first we collect all the ingredients that are needed for food preparation, as a naïve person in the cooking, we follow a cooking recipe which involves set of steps that need to be performed. Let us take an example of a famous dish of western India, poha, which needs many ingredients like beaten rice flakes, mustard, curry leaves, groundnuts, oil, salt, and others. Assume now, with all ingredients, we start making the poha as per the directions in the cooking recipe.

#artificial-intelligence #machine-learning #artificial-neural-network #neural-networks #deep-learning

1594312560

## Autonomous Driving Network (ADN) On Its Way

Talking about inspiration in the networking industry, nothing more than Autonomous Driving Network (ADN). You may hear about this and wondering what this is about, and does it have anything to do with autonomous driving vehicles? Your guess is right; the ADN concept is derived from or inspired by the rapid development of the autonomous driving car in recent years.

Driverless Car of the Future, the advertisement for “America’s Electric Light and Power Companies,” Saturday Evening Post, the 1950s.

The vision of autonomous driving has been around for more than 70 years. But engineers continuously make attempts to achieve the idea without too much success. The concept stayed as a fiction for a long time. In 2004, the US Defense Advanced Research Projects Administration (DARPA) organized the Grand Challenge for autonomous vehicles for teams to compete for the grand prize of \$1 million. I remembered watching TV and saw those competing vehicles, behaved like driven by drunk man, had a really tough time to drive by itself. I thought that autonomous driving vision would still have a long way to go. To my surprise, the next year, 2005, Stanford University’s vehicles autonomously drove 131 miles in California’s Mojave desert without a scratch and took the \$1 million Grand Challenge prize. How was that possible? Later I learned that the secret ingredient to make this possible was using the latest ML (Machine Learning) enabled AI (Artificial Intelligent ) technology.

Since then, AI technologies advanced rapidly and been implemented in all verticals. Around the 2016 time frame, the concept of Autonomous Driving Network started to emerge by combining AI and network to achieve network operational autonomy. The automation concept is nothing new in the networking industry; network operations are continually being automated here and there. But this time, ADN is beyond automating mundane tasks; it reaches a whole new level. With the help of AI technologies and other critical ingredients advancement like SDN (Software Defined Network), autonomous networking has a great chance from a vision to future reality.

In this article, we will examine some critical components of the ADN, current landscape, and factors that are important for ADN to be a success.

# The Vision

At the current stage, there are different terminologies to describe ADN vision by various organizations.

Even though slightly different terminologies, the industry is moving towards some common terms and consensus called autonomous networks, e.g. TMF, ETSI, ITU-T, GSMA. The core vision includes business and network aspects. The autonomous network delivers the “hyper-loop” from business requirements all the way to network and device layers.

On the network layer, it contains the below critical aspects:

• Intent-Driven: Understand the operator’s business intent and automatically translate it into necessary network operations. The operation can be a one-time operation like disconnect a connection service or continuous operations like maintaining a specified SLA (Service Level Agreement) at the all-time.
• **Self-Discover: **Automatically discover hardware/software changes in the network and populate the changes to the necessary subsystems to maintain always-sync state.
• **Self-Config/Self-Organize: **Whenever network changes happen, automatically configure corresponding hardware/software parameters such that the network is at the pre-defined target states.
• **Self-Monitor: **Constantly monitor networks/services operation states and health conditions automatically.
• Auto-Detect: Detect network faults, abnormalities, and intrusions automatically.
• **Self-Diagnose: **Automatically conduct an inference process to figure out the root causes of issues.
• **Self-Healing: **Automatically take necessary actions to address issues and bring the networks/services back to the desired state.
• **Self-Report: **Automatically communicate with its environment and exchange necessary information.
• Automated common operational scenarios: Automatically perform operations like network planning, customer and service onboarding, network change management.

On top of those, these capabilities need to be across multiple services, multiple domains, and the entire lifecycle(TMF, 2019).

No doubt, this is the most ambitious goal that the networking industry has ever aimed at. It has been described as the “end-state” and“ultimate goal” of networking evolution. This is not just a vision on PPT, the networking industry already on the move toward the goal.

David Wang, Huawei’s Executive Director of the Board and President of Products & Solutions, said in his 2018 Ultra-Broadband Forum(UBBF) keynote speech. (David W. 2018):

“In a fully connected and intelligent era, autonomous driving is becoming a reality. Industries like automotive, aerospace, and manufacturing are modernizing and renewing themselves by introducing autonomous technologies. However, the telecom sector is facing a major structural problem: Networks are growing year by year, but OPEX is growing faster than revenue. What’s more, it takes 100 times more effort for telecom operators to maintain their networks than OTT players. Therefore, it’s imperative that telecom operators build autonomous driving networks.”

Juniper CEO Rami Rahim said in his keynote at the company’s virtual AI event: (CRN, 2020)

“The goal now is a self-driving network. The call to action is to embrace the change. We can all benefit from putting more time into higher-layer activities, like keeping distributors out of the business. The future, I truly believe, is about getting the network out of the way. It is time for the infrastructure to take a back seat to the self-driving network.”

# Is This Vision Achievable?

If you asked me this question 15 years ago, my answer would be “no chance” as I could not imagine an autonomous driving vehicle was possible then. But now, the vision is not far-fetch anymore not only because of ML/AI technology rapid advancement but other key building blocks are made significant progress, just name a few key building blocks:

• software-defined networking (SDN) control
• industry-standard models and open APIs
• Real-time analytics/telemetry
• big data processing
• cross-domain orchestration
• programmable infrastructure
• cloud-native virtualized network functions (VNF)
• DevOps agile development process