What is machine learning?

In this tutorial, I will guide you through what machine learning is.

Contents

This post is a part of a series of posts that I will be making. You can read a more detailed version of this post on my personal blog by clicking here. Underneath you can see an overview of the series.

A little bit of history

It seems that most people derive their definition of machine learning from a quote by Arthur Lee Samuel in 1959: “Programming computers to learn from experience should eventually eliminate the need for much of this detailed programming effort.” The interpretation to take away from this is that “machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.”

Machine learning draws a lot of its methods from statistics, but there is a distinctive difference between the two areas: statistics is mainly concerned with estimation, whereas machine learning is mainly concerned with prediction. This distinction makes for great differences, as we will see soon enough.

Categories of machine learning

There are many different machine learning methods that solve different tasks, and putting them all in rigid categories can be quite a task on its own. My posts will cover two fundamental ones: supervised learning and unsupervised learning, which can further be divided into smaller categories as shown in the image above.

It’s important to note that these categories are not strict, e.g. dimensionality reduction isn’t always unsupervised, and you can use density estimation for clustering and classification.

Supervised learning

Supervised learning refers to a subset of machine learning tasks where we’re given a dataset of N input-output pairs, and our goal is to come up with a function h from the inputs to the outputs. Each input variable is a D-dimensional vector (or a scalar) that represents an observation with numerical values. The different dimensions of the input variable are commonly called features or attributes. Each target variable, on the other hand, is most often a scalar.

In classification the possible values for the target variables form a finite number of discrete categories commonly called classes. A classic example is recognizing handwritten digits [1]. Given an image of 28×28 pixels, we can represent each image as a 784-dimensional vector, which will be our input variable, and our target variables will be scalars from 0 to 9 each representing a distinct digit.
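To make the vector representation concrete, here is a minimal sketch of turning a 28×28 image into a 784-dimensional input variable. The image below is a made-up placeholder (all zeros with one bright pixel), not real digit data, and the names `flatten`, `x` and `t` are only chosen for illustration.

```python
# Hypothetical 28x28 grayscale image: 28 rows with 28 pixel values each.
# The pixel values are invented for illustration, not taken from a real dataset.
ROWS, COLS = 28, 28
image = [[0.0 for _ in range(COLS)] for _ in range(ROWS)]
image[3][5] = 1.0  # pretend a single pixel is switched on


def flatten(img):
    """Concatenate the rows of a 2-D image into one flat feature vector."""
    return [pixel for row in img for pixel in row]


x = flatten(image)  # input variable: one feature per pixel
t = 7               # target variable: a class label from 0 to 9 (placeholder)

print(len(x))       # number of features in the input variable
```

Each pixel becomes one feature, so the pixel in row 3, column 5 ends up at index 3 * 28 + 5 of the flattened vector.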

You might’ve heard of regression before. Like classification, we are given a target variable, but in regression it is continuous instead of discrete. An example of regression could be predicting how much a house will be sold for. In this case, the features could be any measurements about the house, the location, and/or what other similar houses have been sold for recently — the target variable is the selling price of the house.
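As a rough sketch of what regression looks like in code, the snippet below fits a straight line to a single feature with ordinary least squares. The house data is invented (I built it so that the price is exactly 50,000 + 2,000 × area), and `fit_line` and `predict` are hypothetical helper names, so treat this as an illustration rather than a realistic pricing model.

```python
# Made-up training data: living area (square metres) -> selling price.
# The numbers are invented so that price = 50_000 + 2_000 * area exactly.
areas = [50.0, 80.0, 100.0, 120.0]
prices = [150_000.0, 210_000.0, 250_000.0, 290_000.0]


def fit_line(xs, ts):
    """Least-squares fit of t = intercept + slope * x for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(ts) / n
    cov_xt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xt / var_x
    intercept = mean_t - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(areas, prices)


def predict(area):
    """Predict a continuous target (the price) for a new input (the area)."""
    return intercept + slope * area


predicted_price = predict(90.0)
print(predicted_price)
```

The key difference from classification is visible in `predict`: the output is a continuous number rather than one of a fixed set of classes.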

Unsupervised learning

Another subset of machine learning tasks falls under unsupervised learning, where we’re only given a dataset of N input variables. In contrast to supervised learning, we’re not told what we want to predict, i.e., we’re not given any target variables. The goal of unsupervised learning is then to find patterns in the data.

The image of categories above divides unsupervised learning into three subtasks, the first one being clustering, which, as the name suggests, refers to the task of discovering ‘clusters’ in the data. We can define a cluster to be a group of observations that are more similar to each other than to observations in other clusters. Let’s say we had to come up with clusters for a basketball, a carrot, and an apple. Firstly, we could create clusters based on shape, in which case the basketball and the apple are both round, but the carrot isn’t. Secondly, we could cluster by use, in which case the carrot and the apple are foods, but the basketball isn’t. Finally, we might cluster by colour, in which case the basketball and the carrot are both orange, but the apple isn’t. All three are valid clusterings, but each one groups the objects by a different property.
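The basketball, carrot and apple example can be sketched in a few lines. Note that this is not a real clustering algorithm: it simply groups the objects by one hand-written attribute at a time, which is enough to show that the 'right' clusters depend on which notion of similarity we pick. The attribute values (for instance, that the apple is red) are my own assumptions.

```python
from collections import defaultdict

# The three objects from the example, described by hand-picked attributes.
# The exact attribute values are assumptions made for illustration.
objects = {
    "basketball": {"shape": "round", "use": "sport", "colour": "orange"},
    "carrot": {"shape": "elongated", "use": "food", "colour": "orange"},
    "apple": {"shape": "round", "use": "food", "colour": "red"},
}


def group_by(attribute):
    """Put objects that share the same value of `attribute` in one group."""
    groups = defaultdict(list)
    for name, attributes in objects.items():
        groups[attributes[attribute]].append(name)
    return dict(groups)


for attribute in ("shape", "use", "colour"):
    print(attribute, group_by(attribute))
```

Every choice of attribute gives a different but equally valid grouping. A real clustering method has to make the same kind of choice through its similarity measure.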

Then we have density estimation, which is the task of fitting probability density functions to the data. It’s important to note that density estimation is often used in conjunction with other tasks like classification, e.g. based on the given classes of our observations, we can use density estimation to find the distribution of each class and thereby (based on the class distributions) classify new observations. An example of density estimation could be finding extreme outliers in data, i.e., finding data that are highly unlikely to have been generated from the density function you fit to the data.
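Here is a small sketch of the outlier example, assuming the data is one-dimensional and roughly Gaussian. It fits a normal distribution with `statistics.NormalDist.from_samples` from the Python standard library and flags points whose density falls below a threshold. The data and the threshold of 1e-3 are arbitrary choices of mine.

```python
from statistics import NormalDist

# Made-up 1-D observations; the values are invented for illustration.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]

# Density estimation: fit a Gaussian by estimating its mean and standard deviation.
density = NormalDist.from_samples(data)


def is_outlier(x, threshold=1e-3):
    """Flag x if the fitted density assigns it a very low value.

    The threshold is an arbitrary assumption; in practice it has to be tuned.
    """
    return density.pdf(x) < threshold


print(is_outlier(10.1))  # close to the bulk of the data
print(is_outlier(15.0))  # far away from anything we observed
```

A single Gaussian is the simplest possible density model. For data with several modes it would be a poor fit, and a more flexible density would be needed.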

Finally, dimensionality reduction, as the name suggests, reduces the number of features of the data that we’re dealing with. Just like density estimation, this is often done in conjunction with other tasks. Say we were going to do a classification task and our input variables had 50 features. If we could do the same task equally well after reducing the number of features to 5, we could save a lot of computation time.
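As a very simple sketch of the idea, the snippet below drops features that never vary across the dataset, since a constant feature cannot help any later task. This is feature selection, one of the simplest forms of dimensionality reduction. Methods such as PCA go further and construct new features, which I don't show here. The dataset and the helper name `drop_low_variance` are made up.

```python
from statistics import pvariance

# Made-up dataset: 4 observations with 3 features each.
# The second feature is constant, so it carries no information.
X = [
    [1.0, 5.0, 0.2],
    [2.0, 5.0, 0.9],
    [3.0, 5.0, 0.4],
    [4.0, 5.0, 0.7],
]


def drop_low_variance(rows, min_variance=1e-12):
    """Keep only the features whose variance across observations exceeds min_variance."""
    columns = list(zip(*rows))  # one tuple of values per feature
    keep = [i for i, column in enumerate(columns) if pvariance(column) > min_variance]
    reduced = [[row[i] for i in keep] for row in rows]
    return reduced, keep


X_reduced, kept = drop_low_variance(X)
print(kept)                # indices of the features we kept
print(len(X_reduced[0]))   # number of features after the reduction
```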

Example: polynomial regression

Let’s go through an example of machine learning. This is also to get familiar with the machine learning terminology. We’re going to implement a model called polynomial regression, where we try and fit a polynomial to our data.

Given a training dataset of N 1-dimensional input variables x with corresponding target variables t, our objective is to fit a polynomial that yields values for future target variables given new input variables. We’ll do this by estimating the coefficients of the polynomial

$$h(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M = \sum_{m=0}^{M} w_m x^m,$$

which we refer to as the parameters or weights of our model. M is the order of our polynomial, and w denotes all our parameters, i.e., we have M+1 parameters for our Mth order polynomial.
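In code, evaluating such a polynomial is a one-liner. The sketch below assumes the parameters are stored in a plain list `w` with `w[m]` being the coefficient of the mth power, so a list of length M+1 describes an Mth order polynomial. The example weights are arbitrary.

```python
def polynomial(x, w):
    """Evaluate w[0] + w[1]*x + ... + w[M]*x**M for a scalar input x."""
    return sum(w_m * x ** m for m, w_m in enumerate(w))


# Arbitrary example weights: M = 2, so there are M + 1 = 3 parameters.
w = [1.0, 2.0, 3.0]

print(polynomial(2.0, w))  # 1 + 2*2 + 3*2**2
```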

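Now, the objective is to estimate the ‘best’ values for our parameters. To do this, we define what is called an objective function (also sometimes called an error function or loss function). We construct our objective function such that it outputs a value that tells us how well our model is performing. For this task, we define the objective function as the sum of the squared differences between the predictions of our polynomial and the corresponding target variables, i.e.

$$E(\mathbf{w}) = \sum_{n=1}^{N} \left( h(x_n, \mathbf{w}) - t_n \right)^2,$$

where h(x_n, w) is the prediction of our polynomial for the nth input variable x_n, and t_n is the corresponding target variable. The smaller this value is, the closer the predictions are to the targets.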
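The sum-of-squares objective described above can be sketched as follows. The training data is made up (the targets follow t = 1 + 2x exactly), and the function names and the two candidate weight vectors are only there to show that better parameters give a lower value.

```python
def polynomial(x, w):
    """Prediction of the polynomial with weights w for a scalar input x."""
    return sum(w_m * x ** m for m, w_m in enumerate(w))


def sum_of_squares_error(w, xs, ts):
    """Sum of squared differences between predictions and targets."""
    return sum((polynomial(x, w) - t) ** 2 for x, t in zip(xs, ts))


# Made-up training data: the targets follow t = 1 + 2x exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ts = [1.0, 3.0, 5.0, 7.0]

good_w = [1.0, 2.0]  # matches the rule that generated the data
bad_w = [0.0, 1.0]   # a poor guess

print(sum_of_squares_error(good_w, xs, ts))
print(sum_of_squares_error(bad_w, xs, ts))
```

Estimating the 'best' parameters then amounts to searching for the weights that make this value as small as possible.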
