Machine Learning Tutorial: What is Machine Learning?

You have probably come across the term machine learning and wondered what exactly it is. This machine learning tutorial will clear up the confusion!

Machine learning is a field of artificial intelligence that powers applications which can feel like magic. Let’s take some real-life examples to understand this. You have probably heard of Google’s self-driving car: a car that drives itself without any human support. That is just amazing, isn’t it?

Now, how about virtual personal assistants such as Apple’s Siri or Microsoft’s Cortana? If you ask Siri for the distance between the Earth and the Moon, it will immediately reply that the distance is 384,400 km.

You must also have used Google Maps. If you want to go from New Jersey to New York by road, Google Maps will show you the distance between the two places, the shortest route and how much traffic there is along the way.

You would agree that all of these feel like magical applications, and the magic behind them is machine learning. Simply put, machine learning is a sub-domain of artificial intelligence in which a machine is given data to learn from so that it can make insightful decisions.

Now that we have understood what machine learning is, let’s go ahead in this machine learning tutorial and look at the types of machine learning algorithms:

Supervised Learning

Unsupervised Learning

Semi-supervised Learning

Reinforcement Learning

Now, let’s go ahead and understand each of these machine learning algorithms comprehensively.

Machine Learning Tutorial: Supervised Learning

In supervised learning, the machine learns from labelled data, i.e. the result for each input is already known. In other words, there is an input variable and an output variable, and the goal is to learn a function that maps the input to the output. Here, the input variable is known as the independent variable and the output variable is known as the dependent variable.

Let’s take this example to understand supervised learning in a better way.

So, this is an apple, isn’t it? Now, how do you know, it’s an apple? Well, as a kid, you would have come across an apple and you were told that it’s an apple and your brain learnt that anything which looks like that is an apple.

Now, let’s apply the same analogy to a machine. Let’s say we feed in different images of apples to the machine and all of these images have the label “apple” associated with them.

Similarly, we will feed in different images of oranges to the machine and all of these images would have the label “orange” associated with them. So, here we are feeding in input data to the machine which is labelled.

So, this part of supervised learning, where the machine learns the features of the input data along with its labels, is known as ‘training’.

Once the training is done, the machine is fed new data, or test data, to determine how well it has learned.

So, if we feed this new image of an orange to the machine without its label, the machine should be able to predict the correct label based on its training.

This is the concept of supervised learning: we train the machine using labelled data and then use that training to make predictions on new data. Supervised learning problems are of two types:

Regression

Classification

Moving on in this machine learning tutorial, we will understand these two comprehensively.

Regression

Since regression is a type of supervised learning, there will be an input variable as well as an output variable. The point to keep in mind is that the output variable, i.e. the dependent variable, is a continuous numerical value.

Watch this complete Machine Learning Tutorial Video: https://www.youtube.com/watch?v=4gqZLajDWh8&feature=youtu.be

Let’s take this example to understand regression:

Let’s say you have two variables, “Number of hours studied” and “Number of marks scored”. Here, we want to understand how the number of marks scored by a student changes with the number of hours the student studied, i.e. “Marks scored” is the dependent variable and “Hours studied” is the independent variable.

Now, based on this data, I want to know how many hours a student should study to score exactly 60 marks. This is where regression techniques come in. The regression model would learn that there is an increase of 10 marks for every extra hour studied, so to score 60 marks the student has to study for 6 hours.
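To make this concrete, here is a minimal sketch of a least-squares line fit in plain Python. The data points are hypothetical (the original table is not reproduced here); they are chosen only so that every extra hour adds 10 marks, as described above. The helper name `fit_line` is also just for illustration.

```python
# Hypothetical training data: every extra hour studied adds 10 marks.
hours = [1, 2, 3, 4, 5]
marks = [10, 20, 30, 40, 50]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, marks)

# Invert the fitted line to find the hours needed for a target score.
target_marks = 60
hours_needed = (target_marks - intercept) / slope

print(slope, intercept, hours_needed)  # 10.0 0.0 6.0
```

The fitted slope is the "10 marks per extra hour" that the model learns, and inverting the line gives the 6 hours needed for 60 marks.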

Note that “marks scored” is the dependent variable and it is a continuous numerical value.

So, this is how regression algorithms work. Now, let’s move on to the next type of supervised learning algorithm: classification.

Classification

Classification algorithms also need both the input data as well as the output data. Here, the output variable or the dependent variable should be categorical in nature.

Let’s take this example to understand classification.

Consider these three variables: “Person has lung cancer or not”, “Weight of the person” and “Number of cigarettes smoked in a day”. Here, we want to predict whether a person has lung cancer based on the person’s weight and the number of cigarettes he/she smokes in a day, i.e. “Having lung cancer” is the dependent variable, and “Weight” and “Number of cigarettes smoked” are the independent variables.

Again, note that “Having lung cancer” is a categorical variable with two categories, “yes” and “no”. Based on the independent variables, we classify whether the person has lung cancer or not.
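As a rough illustration of classification, the sketch below uses a k-nearest-neighbours vote. This is a deliberately simple technique chosen for brevity, not one of the algorithms this tutorial covers. All numbers are made up for the example and carry no medical meaning; the names `training_data` and `predict` are hypothetical.

```python
from collections import Counter
from math import dist

# Hypothetical labelled data: ((weight in kg, cigarettes per day), label).
training_data = [
    ((60, 0), "no"),
    ((72, 1), "no"),
    ((85, 0), "no"),
    ((65, 20), "yes"),
    ((78, 25), "yes"),
    ((90, 30), "yes"),
]


def predict(sample, data, k=3):
    """Return the majority label among the k training rows closest to sample."""
    nearest = sorted(data, key=lambda row: dist(sample, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(predict((70, 22), training_data))  # yes
print(predict((80, 2), training_data))   # no
```

The output is always one of the two categories, “yes” or “no”, which is what makes this a classification problem rather than a regression problem.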

Now, there are a variety of classification algorithms available such as:

Decision Tree

Random Forest

Naïve Bayes

Support Vector Machine

Let’s go ahead and understand one of these algorithms: the decision tree.

Decision Tree Classifier

The decision tree is a very popular machine learning classifier. As the name suggests, a decision tree has an inverted tree-like structure. The topmost node in the tree is known as the root node and the nodes at the bottom of the tree are known as the leaf nodes. Every internal node has a test condition, and based on the result of that test, you move to either its left child or its right child.

Let’s go through an example of a decision tree. Here, we are trying to determine whether a person would watch the movie “Avengers” based on a series of test conditions.

Here, the test condition on the root node is “Likes action films”. If it evaluates to true, you go to the left child, otherwise to the right child. If you do like action films, the left child has another test condition, “Movie length greater than 2 hours”. If this evaluates to true, i.e. you are fine watching a movie that is longer than 2 hours, you again go to the left child. There you find another test condition, “Likes Robert Downey Jr”, and if this evaluates to true, it means that the person is interested in watching “Avengers”. So, this is how a decision tree classifier works.
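The path described above can be sketched as nested conditions. Only the leftmost path is spelled out in the text, so this sketch assumes that every other branch ends in “will not watch”. The function and parameter names are hypothetical. Also note that a real decision tree learns its test conditions from training data, whereas this tree is written by hand.

```python
def will_watch_avengers(likes_action, fine_with_over_2_hours, likes_rdj):
    """Walk the hand-written tree from the root node down to a leaf."""
    if likes_action:                    # root node test
        if fine_with_over_2_hours:      # left child test
            if likes_rdj:               # next left child test
                return True             # leaf: will watch "Avengers"
            return False                # assumed leaf (not given in the text)
        return False                    # assumed leaf (not given in the text)
    return False                        # assumed leaf (not given in the text)


print(will_watch_avengers(True, True, True))   # True
print(will_watch_avengers(True, False, True))  # False
```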

Now that we have understood what exactly supervised learning is, let’s move ahead in this machine learning tutorial and understand unsupervised learning.

Unsupervised Learning

In unsupervised learning, the machine learns from unlabeled data, i.e. the result for the input data is not known beforehand. Here, the algorithm tries to determine the underlying structure of the data.

Now, let’s go through this example to see how unsupervised learning works.

Here, we have a bunch of fruits, and none of them has a label associated with it. Now, let’s take these fruits and feed them to an unsupervised learning model. The model determines the features associated with the data, understands that all the apples are similar in nature and thus groups them together. Similarly, it understands that all the bananas have the same features and groups them together, and the same is the case with all the mangoes.

So, you need to understand that, even though there are no class labels associated with the data, the model was able to group them into different clusters on the basis of similarity of the data.
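Here is a minimal sketch of this grouping using k-means clustering, written in plain Python. The fruit measurements (weight in grams, length in centimetres) are invented for illustration. As a simplifying assumption, the first `k` points are used as the starting centroids so that the run is deterministic; practical implementations usually pick the starting centroids randomly or with a smarter scheme.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    centroids = list(points[:k])  # assumption: first k points as start
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # nothing changed, so we have converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical unlabeled fruits: (weight in g, length in cm).
fruits = [
    (150, 7.0), (120, 18.0), (300, 12.0),
    (160, 7.5), (125, 19.0), (320, 13.0),
    (155, 7.2), (118, 17.5), (310, 12.5),
]

labels, centroids = k_means(fruits, k=3)
print(labels)  # [0, 1, 2, 0, 1, 2, 0, 1, 2]
```

The cluster numbers 0, 1 and 2 are arbitrary identifiers. The model never learns the words “apple”, “banana” or “mango”; it only discovers that the points fall into three groups.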

These are some unsupervised learning algorithms:

K-means clustering

Hierarchical Clustering

Principal Component Analysis

Further in this machine learning tutorial, we go through the next type of machine learning algorithm: semi-supervised learning.

Machine Learning Tutorial: Semi-Supervised Learning

In semi-supervised learning the machine learns from a combination of labelled and unlabeled data, i.e. you can consider semi-supervised learning to be an amalgamation of both supervised learning and unsupervised learning.

Let’s go through this example. Here, we have a bunch of different items: phones, apples, books and chairs. As you can see, only a small proportion of the items are labelled and the rest are unlabeled. The basic idea is to start off by grouping similar data together. So, all the phones would be put into one group, the apples into another, and the same is the case with the books and chairs.

Now we have four clusters containing similar data in them. Here, the algorithm assumes that all the data points which are in proximity tend to have the same label associated with them. Now, the semi-supervised algorithm uses the existing labelled data to assign labels to the rest of the unlabeled data.
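The label-assignment step can be sketched as follows. This assumes the clustering has already been done, so `cluster_ids` is simply given as input, and the item indices and labels are hypothetical. Each cluster takes the most common label among its labelled members, and that label is then copied to every item in the cluster.

```python
from collections import Counter


def propagate_labels(cluster_ids, known_labels):
    """Give every item the majority known label of its cluster.

    cluster_ids:  cluster number for each item (from a prior clustering step)
    known_labels: dict mapping item index -> label for the few labelled items
    """
    votes = {}
    for index, label in known_labels.items():
        votes.setdefault(cluster_ids[index], Counter())[label] += 1
    cluster_label = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    # Items in a cluster with no labelled member get None.
    return [cluster_label.get(c) for c in cluster_ids]


# Hypothetical example: ten items in four clusters, only four of them labelled.
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]
known_labels = {0: "phone", 3: "apple", 6: "book", 9: "chair"}

all_labels = propagate_labels(cluster_ids, known_labels)
print(all_labels)
```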

So, this is the underlying concept of semi-supervised learning. Now, in this machine learning tutorial, let’s head on to the final type of machine learning algorithm, which is reinforcement learning.

Reinforcement Learning

In reinforcement learning, the algorithm learns through a system of rewards and punishments, and the goal is to maximize the total reward. So, let’s go through this example to understand reinforcement learning.

So, here we have a self-driving car which is supposed to reach its destination without hitting any barricades. Here, the self-driving car is the agent and the road is the environment.

Now, the car takes an action and goes straight, but when it goes straight, it directly hits the barricade. Now, since the car has taken a wrong action, it will be punished.

The car realizes that going straight is wrong and that it has to go right. When it goes right, it is given a reward. This process continues, and the car learns how to drive by itself without hitting any barricades.
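The reward-and-punishment loop can be sketched with a very small value-learning example. It is heavily simplified: there is a single situation with two possible actions, and the reward values (-1 for hitting the barricade, +1 for avoiding it) and the learning rate are assumptions made for illustration. Real reinforcement learning problems involve many states and usually some random exploration.

```python
# Hypothetical rewards: going straight hits the barricade, going right avoids it.
REWARDS = {"straight": -1.0, "right": 1.0}


def learn_action_values(episodes=20, learning_rate=0.5):
    """Estimate how good each action is from the rewards received."""
    values = {"straight": 0.0, "right": 0.0}
    for _ in range(episodes):
        # Pick the action that currently looks best. On a tie, max() returns
        # the first key, so the car tries "straight" first.
        action = max(values, key=values.get)
        reward = REWARDS[action]
        # Nudge the estimate for that action towards the reward just received.
        values[action] += learning_rate * (reward - values[action])
    return values


values = learn_action_values()
best_action = max(values, key=values.get)
print(best_action)  # right
```

In the first episode the car goes straight, is punished, and lowers its estimate for that action. From then on it goes right, is rewarded each time, and its estimate for going right keeps rising.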

And this brings us to the end of this “Machine Learning Tutorial”. We understood what machine learning is and then looked at the types of machine learning.

Now, if you are interested in doing an end-to-end certification course in Machine Learning, you can check out Intellipaat’s Machine Learning Course with Python.

Originally published at www.intellipaat.com on August 26, 2019.
