An Introduction to Machine Learning for Beginners

An Introduction to Machine Learning for Beginners

In this tutorial, we’ll look into the common machine learning methods of supervised and unsupervised learning, and common algorithmic approaches in machine learning, including the k-nearest neighbor algorithm, decision tree learning, and deep learning.

In this tutorial, we’ll look into the common machine learning methods of supervised and unsupervised learning, and common algorithmic approaches in machine learning, including the k-nearest neighbor algorithm, decision tree learning, and deep learning.

Introduction Machine Learning

Machine learning is a subfield of artificial intelligence (AI). The goal of machine learning generally is to understand the structure of data and fit that data into models that can be understood and utilized by people.

Although machine learning is a field within computer science, it differs from traditional computational approaches. In traditional computing, algorithms are sets of explicitly programmed instructions used by computers to calculate or problem solve. Machine learning algorithms instead allow for computers to train on data inputs and use statistical analysis in order to output values that fall within a specific range. Because of this, machine learning facilitates computers in building models from sample data in order to automate decision-making processes based on data inputs.

Any technology user today has benefitted from machine learning. Facial recognition technology allows social media platforms to help users tag and share photos of friends. Optical character recognition (OCR) technology converts images of text into movable type. Recommendation engines, powered by machine learning, suggest what movies or television shows to watch next based on user preferences. Self-driving cars that rely on machine learning to navigate may soon be available to consumers.

Machine learning is a continuously developing field. Because of this, there are some considerations to keep in mind as you work with machine learning methodologies, or analyze the impact of machine learning processes.

In this tutorial, we’ll look into the common machine learning methods of supervised and unsupervised learning, and common algorithmic approaches in machine learning, including the k-nearest neighbor algorithm, decision tree learning, and deep learning. We’ll explore which programming languages are most used in machine learning, providing you with some of the positive and negative attributes of each. Additionally, we’ll discuss biases that are perpetuated by machine learning algorithms, and consider what can be kept in mind to prevent these biases when building algorithms.

Machine Learning Methods

In machine learning, tasks are generally classified into broad categories. These categories are based on how learning is received or how feedback on the learning is given to the system developed.

Two of the most widely adopted machine learning methods are supervised learning which trains algorithms based on example input and output data that is labeled by humans, and unsupervised learning which provides the algorithm with no labeled data in order to allow it to find structure within its input data. Let’s explore these methods in more detail.

Supervised Learning

In supervised learning, the computer is provided with example inputs that are labeled with their desired outputs. The purpose of this method is for the algorithm to be able to “learn” by comparing its actual output with the “taught” outputs to find errors, and modify the model accordingly. Supervised learning therefore uses patterns to predict label values on additional unlabeled data.

For example, with supervised learning, an algorithm may be fed data with images of sharks labeled as fish and images of oceans labeled as water. By being trained on this data, the supervised learning algorithm should be able to later identify unlabeled shark images as fish and unlabeled ocean images as water.

A common use case of supervised learning is to use historical data to predict statistically likely future events. It may use historical stock market information to anticipate upcoming fluctuations, or be employed to filter out spam emails. In supervised learning, tagged photos of dogs can be used as input data to classify untagged photos of dogs.

Unsupervised Learning

In unsupervised learning, data is unlabeled, so the learning algorithm is left to find commonalities among its input data. As unlabeled data are more abundant than labeled data, machine learning methods that facilitate unsupervised learning are particularly valuable.

The goal of unsupervised learning may be as straightforward as discovering hidden patterns within a dataset, but it may also have a goal of feature learning, which allows the computational machine to automatically discover the representations that are needed to classify raw data.

Unsupervised learning is commonly used for transactional data. You may have a large dataset of customers and their purchases, but as a human you will likely not be able to make sense of what similar attributes can be drawn from customer profiles and their types of purchases. With this data fed into an unsupervised learning algorithm, it may be determined that women of a certain age range who buy unscented soaps are likely to be pregnant, and therefore a marketing campaign related to pregnancy and baby products can be targeted to this audience in order to increase their number of purchases.

Without being told a “correct” answer, unsupervised learning methods can look at complex data that is more expansive and seemingly unrelated in order to organize it in potentially meaningful ways. Unsupervised learning is often used for anomaly detection including for fraudulent credit card purchases, and recommender systems that recommend what products to buy next. In unsupervised learning, untagged photos of dogs can be used as input data for the algorithm to find likenesses and classify dog photos together.

Approaches

As a field, machine learning is closely related to computational statistics, so having a background knowledge in statistics is useful for understanding and leveraging machine learning algorithms.

For those who may not have studied statistics, it can be helpful to first define correlation and regression, as they are commonly used techniques for investigating the relationship among quantitative variables. Correlation is a measure of association between two variables that are not designated as either dependent or independent. Regression at a basic level is used to examine the relationship between one dependent and one independent variable. Because regression statistics can be used to anticipate the dependent variable when the independent variable is known, regression enables prediction capabilities.

Approaches to machine learning are continuously being developed. For our purposes, we’ll go through a few of the popular approaches that are being used in machine learning at the time of writing.

k-nearest neighbor

The k-nearest neighbor algorithm is a pattern recognition model that can be used for classification as well as regression. Often abbreviated as k-NN, the k in k-nearest neighbor is a positive integer, which is typically small. In either classification or regression, the input will consist of the k closest training examples within a space.

We will focus on k-NN classification. In this method, the output is class membership. This will assign a new object to the class most common among its k nearest neighbors. In the case of k = 1, the object is assigned to the class of the single nearest neighbor.

Let’s look at an example of k-nearest neighbor. In the diagram below, there are blue diamond objects and orange star objects. These belong to two separate classes: the diamond class and the star class.

When a new object is added to the space — in this case a green heart — we will want the machine learning algorithm to classify the heart to a certain class.

When we choose k = 3, the algorithm will find the three nearest neighbors of the green heart in order to classify it to either the diamond class or the star class.

In our diagram, the three nearest neighbors of the green heart are one diamond and two stars. Therefore, the algorithm will classify the heart with the star class.

Among the most basic of machine learning algorithms, k-nearest neighbor is considered to be a type of “lazy learning” as generalization beyond the training data does not occur until a query is made to the system.

Decision Tree Learning

For general use, decision trees are employed to visually represent decisions and show or inform decision making. When working with machine learning and data mining, decision trees are used as a predictive model. These models map observations about data to conclusions about the data’s target value.

The goal of decision tree learning is to create a model that will predict the value of a target based on input variables.

In the predictive model, the data’s attributes that are determined through observation are represented by the branches, while the conclusions about the data’s target value are represented in the leaves.

When “learning” a tree, the source data is divided into subsets based on an attribute value test, which is repeated on each of the derived subsets recursively. Once the subset at a node has the equivalent value as its target value has, the recursion process will be complete.

Let’s look at an example of various conditions that can determine whether or not someone should go fishing. This includes weather conditions as well as barometric pressure conditions.

In the simplified decision tree above, an example is classified by sorting it through the tree to the appropriate leaf node. This then returns the classification associated with the particular leaf, which in this case is either a Yes or a No. The tree classifies a day’s conditions based on whether or not it is suitable for going fishing.

A true classification tree data set would have a lot more features than what is outlined above, but relationships should be straightforward to determine. When working with decision tree learning, several determinations need to be made, including what features to choose, what conditions to use for splitting, and understanding when the decision tree has reached a clear ending.

Deep Learning

Deep learning attempts to imitate how the human brain can process light and sound stimuli into vision and hearing. A deep learning architecture is inspired by biological neural networks and consists of multiple layers in an artificial neural network made up of hardware and GPUs.

Deep learning uses a cascade of nonlinear processing unit layers in order to extract or transform features (or representations) of the data. The output of one layer serves as the input of the successive layer. In deep learning, algorithms can be either supervised and serve to classify data, or unsupervised and perform pattern analysis.

Among the machine learning algorithms that are currently being used and developed, deep learning absorbs the most data and has been able to beat humans in some cognitive tasks. Because of these attributes, deep learning has become the approach with significant potential in the artificial intelligence space

Computer vision and speech recognition have both realized significant advances from deep learning approaches. IBM Watson is a well-known example of a system that leverages deep learning.

Programming Languages

When choosing a language to specialize in with machine learning, you may want to consider the skills listed on current job advertisements as well as libraries available in various languages that can be used for machine learning processes.

From data taken from job ads on indeed.com in December 2016, it can be inferred that Python is the most sought-for programming language in the machine learning professional field. Python is followed by Java, then R, then C++.

Python’s popularity may be due to the increased development of deep learning frameworks available for this language recently, including TensorFlow, PyTorch, and Keras. As a language that has readable syntax and the ability to be used as a scripting language, Python proves to be powerful and straightforward both for preprocessing data and working with data directly. The scikit-learn machine learning library is built on top of several existing Python packages that Python developers may already be familiar with, namely NumPy, SciPy, and Matplotlib.

To get started with Python, you can read our tutorial series on “How To Code in Python 3,” or read specifically on “How To Build a Machine Learning Classifier in Python with scikit-learn” or “How To Perform Neural Style Transfer with Python 3 and PyTorch.”

Java is widely used in enterprise programming, and is generally used by front-end desktop application developers who are also working on machine learning at the enterprise level. Usually it is not the first choice for those new to programming who want to learn about machine learning, but is favored by those with a background in Java development to apply to machine learning. In terms of machine learning applications in industry, Java tends to be used more than Python for network security, including in cyber attack and fraud detection use cases.

Among machine learning libraries for Java are Deeplearning4j, an open-source and distributed deep-learning library written for both Java and Scala; MALLET (MAchine Learning for LanguagE Toolkit) allows for machine learning applications on text, including natural language processing, topic modeling, document classification, and clustering; and Weka, a collection of machine learning algorithms to use for data mining tasks.

R is an open-source programming language used primarily for statistical computing. It has grown in popularity over recent years, and is favored by many in academia. R is not typically used in industry production environments, but has risen in industrial applications due to increased interest in data science. Popular packages for machine learning in R include caret (short for Classification And REgression Training) for creating predictive models, randomForest for classification and regression, and e1071 which includes functions for statistics and probability theory.

C++ is the language of choice for machine learning and artificial intelligence in game or robot applications (including robot locomotion). Embedded computing hardware developers and electronics engineers are more likely to favor C++ or C in machine learning applications due to their proficiency and level of control in the language. Some machine learning libraries you can use with C++ include the scalable mlpack, Dlib offering wide-ranging machine learning algorithms, and the modular and open-source Shark.

Human Biases

Although data and computational analysis may make us think that we are receiving objective information, this is not the case; being based on data does not mean that machine learning outputs are neutral. Human bias plays a role in how data is collected, organized, and ultimately in the algorithms that determine how machine learning will interact with that data.

If, for example, people are providing images for “fish” as data to train an algorithm, and these people overwhelmingly select images of goldfish, a computer may not classify a shark as a fish. This would create a bias against sharks as fish, and sharks would not be counted as fish.

When using historical photographs of scientists as training data, a computer may not properly classify scientists who are also people of color or women. In fact, recent peer-reviewed research has indicated that AI and machine learning programs exhibit human-like biases that include race and gender prejudices. See, for example “Semantics derived automatically from language corpora contain human-like biases” and “Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints” [PDF].

As machine learning is increasingly leveraged in business, uncaught biases can perpetuate systemic issues that may prevent people from qualifying for loans, from being shown ads for high-paying job opportunities, or from receiving same-day delivery options.

Because human bias can negatively impact others, it is extremely important to be aware of it, and to also work towards eliminating it as much as possible. One way to work towards achieving this is by ensuring that there are diverse people working on a project and that diverse people are testing and reviewing it. Others have called for regulatory third parties to monitor and audit algorithms, building alternative systems that can detect biases, and ethics reviews as part of data science project planning. Raising awareness about biases, being mindful of our own unconscious biases, and structuring equity in our machine learning projects and pipelines can work to combat bias in this field.

Conclusion

This tutorial reviewed some of the use cases of machine learning, common methods and popular approaches used in the field, suitable machine learning programming languages, and also covered some things to keep in mind in terms of unconscious biases being replicated in algorithms.

Because machine learning is a field that is continuously being innovated, it is important to keep in mind that algorithms, methods, and approaches will continue to change.

Artificial Intelligence vs. Machine Learning vs. Deep Learning

Artificial Intelligence vs. Machine Learning vs. Deep Learning

Artificial Intelligence vs. Machine Learning vs. Deep Learning. We are going to discuss we difference between Artificial Intelligence, Machine Learning, and Deep Learning

We are going to discuss we difference between Artificial Intelligence, Machine Learning, and Deep Learning

Furthermore, we will address the question of why Deep Learning as a young emerging field is far superior to traditional Machine Learning

Artificial Intelligence, Machine Learning, and Deep Learning are popular buzzwords that everyone seems to use nowadays.

But still, there is a big misconception among many people about the meaning of these terms.

In the worst case, one may think that these terms describe the same thing — which is simply false.

A large number of companies claim nowadays to incorporate some kind of “ Artificial Intelligence” (AI) in their applications or services.

But artificial intelligence is only a broader term that describes applications when a machine mimics “ cognitive “ functions that humans associate with other human minds, such as “learning” and “problem-solving”.

On a lower level, an AI can be only a programmed rule that determines the machine to behave in a certain way in certain situations. So basically Artificial Intelligence can be nothing more than just a bunch of if-else statements.

An if-else statement is a simple rule explicitly programmed by a human. Consider a very abstract, simple example of a robot who is moving on a road. A possible programmed rule for that robot could look as follows:

Instead, when speaking of Artificial Intelligence it's only worthwhile to consider two different approaches: Machine Learning and Deep Learning. Both are subfields of Artificial Intelligence

Machine Learning vs Deep Learning

Now that we now better understand what Artificial Intelligence means we can take a closer look at Machine Learning and Deep Learning and make a clearer distinguishment between these two.

Machine Learning incorporates “ classical” algorithms for various kinds of tasks such as clustering, regression or classification. Machine Learning algorithms must be trained on data. The more data you provide to your algorithm, the better it gets.

The “training” part of a Machine Learning model means that this model tries to optimize along a certain dimension. In other words, the Machine Learning models try to minimize the error between their predictions and the actual ground truth values.

For this we must define a so-called error function, also called a loss-function or an objective function … because after all the model has an objective. This objective could be for example classification of data into different categories (e.g. cat and dog pictures) or prediction of the expected price of a stock in the near future.

When someone says they are working with a machine-learning algorithm, you can get to the gist of its value by asking: What’s the objective function?

At this point, you may ask: How do we minimize the error?

One way would be to compare the prediction of the model with the ground truth value and adjust the parameters of the model in a way so that next time, the error between these two values is smaller. This is repeated again and again and again.

Thousands and millions of times, until the parameters of the model that determine the predictions are so good, that the difference between the predictions of the model and the ground truth labels are as small as possible.

In short machine learning models are optimization algorithms. If you tune them right, they minimize their error by guessing and guessing and guessing again.

Machine Learning is old…

The basic definition of machine learning is:

Algorithms that analyze data, learn from it and make informed decisions based on the learned insights.

Machine learning leads to a variety of automated tasks. It affects virtually every industry — from IT security malware search, to weather forecasting, to stockbrokers looking for cheap trades. Machine learning requires complex math and a lot of coding to finally get the desired functions and results.

Machine learning algorithms need to be trained on large amounts of data.
The more data you provide for your algorithm, the better it gets.

Machine Learning is a pretty old field and incorporates methods and algorithms that have been around for dozens of years, some of them since as early as the sixties.

These classic algorithms include algorithms such as the so-called Naive Bayes Classifier and the Support Vector Machines. Both are often used in the classification of data.

In addition to the classification, there are also cluster analysis algorithms such as the well-known K-Means and the tree-based clustering. To reduce the dimensionality of data and gain more insight into its nature, machine learning uses methods such as principal component analysis and tSNE.

Deep Learning — The next big Thing

Now let’s focus on the essential thing that is at stake here. On deep learning.
Deep Learning is a very young field of artificial intelligence based on artificial neural networks.

Again, deep learning can be seen as a part of machine learning because deep learning algorithms also need data to learn how to solve problems. Therefore, the terms of machine learning and deep learning are often treated as the same. However, these systems have different capabilities.

Deep Learning uses a multi-layered structure of algorithms called the neural network:

It can be viewed again as a subfield of Machine Learning since Deep Learning algorithms also require data in order to learn to solve tasks. Although methods of Deep Learning are able to perform the same tasks as classic Machine Learning algorithms, it is not the other way round.

Artificial neural networks have unique capabilities that enable Deep Learning models to solve tasks that Machine Learning models could never solve.

All recent advances in intelligence are due to Deep Learning. Without Deep Learning we would not have self-driving cars, chatbots or personal assistants like Alexa and Siri. Google Translate app would remain primitive and Netflix would have no idea which movies or TV series we like or dislike.

We can even go so far as to say that the new industrial revolution is driven by artificial neural networks and Deep Learning. This is the best and closest approach to true machine intelligence we have so far. The reason is that Deep Learning has two major advantages over Machine Learning.

Why is Deep Learning better than Machine Learning?

Feature Extraction

The first advantage of Deep Learning over machine learning is the needlessness of the so-called Feature Extraction.

Long before deep learning was used, traditional machine learning methods were popular, such as Decision Trees, SVM, Naïve Bayes Classifier and Logistic Regression. These algorithms are also called “flat algorithms”.

Flat means here that these algorithms can not normally be applied directly to the raw data (such as .csv, images, text, etc.). We require a preprocessing step called Feature Extraction.

The result of Feature Extraction is an abstract representation of the given raw data that can now be used by these classic machine learning algorithms to perform a task. For example, the classification of the data into several categories or classes.

Feature Extraction is usually pretty complicated and requires detailed knowledge of the problem domain. This step must be adapted, tested and refined over several iterations for optimal results.

On the other side are the artificial neural networks. These do not require the step of feature extraction. The layers are able to learn an implicit representation of the raw data directly on their own.

Here, a more and more abstract and compressed representation of the raw data is produced over several layers of an artificial neural network. This compressed representation of the input data is then used to produce the result. The result can be, for example, the classification of the input data into different classes.

In other words, we can also say that the feature extraction step is already a part of the process that takes place in an artificial neural network. During the training process, this step is also optimized by the neural network to obtain the best possible abstract representation of the input data. This means that the models of deep learning thus require little to no manual effort to perform and optimize the feature extraction process.

For example, if you want to use a machine learning model to determine whether a particular image shows a car or not, we humans first need to identify the unique features of a car (shape, size, windows, wheels, etc.), extract these features and give them to the algorithm as input data. This way, the machine learning algorithm would perform a classification of the image. That is, in machine learning, a programmer must intervene directly in the classification process.

In the case of a deep learning model, the feature extraction step is completely unnecessary. The model would recognize these unique characteristics of a car and make correct predictions- completely without the help of a human.

In fact, this applies to every other task you’ll ever do with neural networks.
They just give the raw data to the neural network, the rest is done by the model.

The Era of Big Data…

The second huge advantage of Deep Learning and a key part in understanding why it’s becoming so popular is that it’s powered by massive amounts of data. The “Big Data Era” of technology will provide huge amounts of opportunities for new innovations in deep learning. To quote Andrew Ng, the chief scientist of China’s major search engine Baidu and one of the leaders of the Google Brain Project:

The analogy to deep learning is that the rocket engine is the deep learning models and the fuel is the huge amounts of data we can feed to these algorithms.

Deep Learning models scale better with a larger amount of data

Deep Learning models tend to increase their accuracy with the increasing amount of training data, where’s traditional machine learning models such as SVM and Naive Bayes classifier stop improving after a saturation point.

Special Announcement: We just released a free Course on Deep Learning!

I am the founder of DeepLearning Academy, an advanced Deep Learning education platform. We provide practical state-of-the-art Deep Learning education and mentoring to professionals and beginners.

Among our things we just released a free Introductory Course on Deep Learning with TensorFlow, where you can learn how to implement Neural Networks from Scratch for various use-cases using TensorFlow.

If you are interested in this topic, feel free to check it out ;)

Artificial Intelligence vs. Machine Learning vs. Deep Learning

Artificial Intelligence vs. Machine Learning vs. Deep Learning

Learn the Difference between the most popular Buzzwords in today's tech. World — AI, Machine Learning and Deep Learning

In this article, we are going to discuss we difference between Artificial Intelligence, Machine Learning, and Deep Learning.

Furthermore, we will address the question of why Deep Learning as a young emerging field is far superior to traditional Machine Learning.

Artificial Intelligence, Machine Learning, and Deep Learning are popular buzzwords that everyone seems to use nowadays.

But still, there is a big misconception among many people about the meaning of these terms.

In the worst case, one may think that these terms describe the same thing — which is simply false.

A large number of companies claim nowadays to incorporate some kind of “ Artificial Intelligence” (AI) in their applications or services.

But artificial intelligence is only a broader term that describes applications when a machine mimics “ cognitive “ functions that humans associate with other human minds, such as “learning” and “problem-solving”.

On a lower level, an AI can be only a programmed rule that determines the machine to behave in a certain way in certain situations. So basically Artificial Intelligence can be nothing more than just a bunch of if-else statements.

An if-else statement is a simple rule explicitly programmed by a human. Consider a very abstract, simple example of a robot who is moving on a road. A possible programmed rule for that robot could look as follows:

Instead, when speaking of Artificial Intelligence it's only worthwhile to consider two different approaches: Machine Learning and Deep Learning. Both are subfields of Artificial Intelligence

Machine Learning vs Deep Learning

Now that we now better understand what Artificial Intelligence means we can take a closer look at Machine Learning and Deep Learning and make a clearer distinguishment between these two.

Machine Learning incorporates “ classical” algorithms for various kinds of tasks such as clustering, regression or classification. Machine Learning algorithms must be trained on data. The more data you provide to your algorithm, the better it gets.

The “training” part of a Machine Learning model means that this model tries to optimize along a certain dimension. In other words, the Machine Learning models try to minimize the error between their predictions and the actual ground truth values.

For this we must define a so-called error function, also called a loss-function or an objective function … because after all the model has an objective. This objective could be for example classification of data into different categories (e.g. cat and dog pictures) or prediction of the expected price of a stock in the near future.

When someone says they are working with a machine-learning algorithm, you can get to the gist of its value by asking: What’s the objective function?

At this point, you may ask: How do we minimize the error?

One way would be to compare the prediction of the model with the ground truth value and adjust the parameters of the model in a way so that next time, the error between these two values is smaller. This is repeated again and again and again.

Thousands and millions of times, until the parameters of the model that determine the predictions are so good, that the difference between the predictions of the model and the ground truth labels are as small as possible.

In short machine learning models are optimization algorithms. If you tune them right, they minimize their error by guessing and guessing and guessing again.

Machine Learning is old…

Machine Learning is a pretty old field and incorporates methods and algorithms that have been around for dozens of years, some of them since as early as the sixties.

Some known methods of classification and prediction are the Naive Bayes Classifier and the Support Vector Machines. In addition to the classification, there are also clustering algorithms such as the well-known K-Means and tree-based clustering. To reduce the dimensionality of data to gain more insights about it’ nature methods such as Principal component analysis and tSNE are used.

Deep Learning — The next big Thing

Deep Learning, on the other hand, is a very young field of Artificial Intelligence that is powered by artificial neural networks.

It can be viewed again as a subfield of Machine Learning since Deep Learning algorithms also require data in order to learn to solve tasks. Although methods of Deep Learning are able to perform the same tasks as classic Machine Learning algorithms, it is not the other way round.

Artificial neural networks have unique capabilities that enable Deep Learning models to solve tasks that Machine Learning models could never solve.

All recent advances in intelligence are due to Deep Learning. Without Deep Learning we would not have self-driving cars, chatbots or personal assistants like Alexa and Siri. Google Translate app would remain primitive and Netflix would have no idea which movies or TV series we like or dislike.

We can even go so far as to say that the new industrial revolution is driven by artificial neural networks and Deep Learning. This is the best and closest approach to true machine intelligence we have so far. The reason is that Deep Learning has two major advantages over Machine Learning.

Why is Deep Learning better than Machine Learning?

The first advantage is the needlessness of Feature Extraction. What do I mean by this?

Well if you want to use a Machine Learning model to determine whether a given picture shows a car or not, we as humans, must first program the unique features of a car (shape, size, windows, wheels etc.) into the algorithm. This way the algorithm would know what to look after in the given pictures.

In the case of a Deep Learning model, is step is completely unnecessary. The model would recognize all the unique characteristics of a car by itself and make correct predictions.

In fact, the needlessness of feature extraction applies to any other task for a deep learning model. You simply give the neural network the raw data, the rest is done by the model. While for a machine learning model, you would need to perform additional steps, such as the already mentioned extraction of the features of the given data.

The second huge advantage of Deep Learning and a key part in understanding why it’s becoming so popular is that it’s powered by massive amounts of data. The “Big Data Era” of technology will provide huge amounts of opportunities for new innovations in deep learning. To quote Andrew Ng, the chief scientist of China’s major search engine Baidu and one of the leaders of the Google Brain Project:

The analogy to deep learning is that the rocket engine is the deep learning models and the fuel is the huge amounts of data we can feed to these algorithms.

Deep Learning models tend to increase their accuracy with the increasing amount of training data, where’s traditional machine learning models such as SVM and Naive Bayes classifier stop improving after a saturation point.

Machine Learning vs Deep Learning: What's the Difference?

Machine Learning vs Deep Learning: What's the Difference?

In this post talks about the differences and relationship between Artificial intelligence (AI), Machine Learning (ML) and Deep Learning (DL)

In this post talks about the differences and relationship between Artificial intelligence (AI), Machine Learning (ML) and Deep Learning (DL)

**In this tutorial, you will learn: **

  • What is Artificial intelligence (AI)?
  • What is Machine Learning (ML)?
  • What is Deep Learning (DL)?
  • Machine Learning Process
  • Deep Learning Process
  • Automate Feature Extraction using DL
  • Difference between Machine Learning and Deep Learning
  • When to use ML or DL?
What is AI?

Artificial intelligence is imparting a cognitive ability to a machine. The benchmark for **AI **is the human intelligence regarding reasoning, speech, and vision. This benchmark is far off in the future.

AI has three different levels:

  1. Narrow AI: A Artificial intelligence is said to be narrow when the machine can perform a specific task better than a human. The current research of AI is here now
  2. General AI: An artificial intelligence reaches the general state when it can perform any intellectual task with the same accuracy level as a human would
  3. Active AI: An AI is active when it can beat humans in many tasks

Early AI systems used pattern matching and expert systems.

What is ML?

Machine learning is the best tool so far to analyze, understand and identify a pattern in the data. One of the main ideas behind Machine Learning is that the computer can be trained to automate tasks that would be exhaustive or impossible for a human being. The clear breach from the traditional analysis is that machine learning can take decisions with minimal human intervention.

Machine learning uses data to feed an algorithm that can understand the relationship between the input and the output. When the machine finished learning, it can predict the value or the class of new data point.

What is Deep Learning?

Deep learning is a computer software that mimics the network of neurons in a brain. It is a subset of machine learning and is called deep learning because it makes use of deep neural networks. The machine uses *different *layers to *learn *from the data. The depth of the model is represented by the number of layers in the model. Deep learning is the new state of the art in term of AI. In deep learning, the learning phase is done through a neural network. A neural network is an architecture where the layers are stacked on top of each other

Machine Learning Process

Imagine you are meant to build a program that recognizes objects. To train the model, you will use a classifier. A classifier uses the features of an object to try identifying the class it belongs to.

In the example, the classifier will be trained to detect if the image is a:

  • What is Artificial intelligence (AI)?
  • What is Machine Learning (ML)?
  • What is Deep Learning (DL)?
  • Machine Learning Process
  • Deep Learning Process
  • Automate Feature Extraction using DL
  • Difference between Machine Learning and Deep Learning
  • When to use ML or DL?

The four objects above are the class the classifier has to recognize. To construct a classifier, you need to have some data as input and assigns a label to it. The algorithm will take these data, find a pattern and then classify it in the corresponding class.

This task is called supervised learning. In supervised learning, the training data you feed to the algorithm includes a label.

Training an algorithm requires to follow a few standard steps:

  • What is Artificial intelligence (AI)?
  • What is Machine Learning (ML)?
  • What is Deep Learning (DL)?
  • Machine Learning Process
  • Deep Learning Process
  • Automate Feature Extraction using DL
  • Difference between Machine Learning and Deep Learning
  • When to use ML or DL?

The first step is necessary, choosing the right data will make the algorithm success or a failure. The data you choose to train the model is called a feature. In the object example, the features are the pixels of the images.

Each image is a row in the data while each pixel is a column. If your image is a 28x28 size, the dataset contains 784 columns (28x28). In the picture below, each picture has been transformed into a feature vector. The label tells the computer what object is in the image.

The objective is to use these training data to classify the type of object. The first step consists of creating the feature columns. Then, the second step involves choosing an algorithm to train the model. When the training is done, the model will predict what picture corresponds to what object.

After that, it is easy to use the model to predict new images. For each new image feeds into the model, the machine will predict the class it belongs to. For example, an entirely new image without a label is going through the model. For a human being, it is trivial to visualize the image as a car. The machine uses its previous knowledge to predict as well the image is a car.

Deep Learning Process

In deep learning, the *learning *phase is done through a neural network. A neural network is an architecture where the layers are stacked on top of each other.

Consider the same image example above. The training set would be fed to a neural network

Each input goes into a neuron and is multiplied by a weight. The result of the multiplication flows to the next layer and become the input. This process is repeated for each layer of the network. The final layer is named the output layer; it provides an actual value for the regression task and a probability of each class for the classification task. The neural network uses a mathematical algorithm to update the weights of all the neurons. The neural network is fully trained when the value of the weights gives an output close to the reality. For instance, a well-trained neural network can recognize the object on a picture with higher accuracy than the traditional neural net.

Automate Feature Extraction using DL

A dataset can contain a dozen to hundreds of features. The system will learn from the relevance of these features. However, not all features are meaningful for the algorithm. A crucial part of machine learning is to find a relevant set of features to make the system learns something.

One way to perform this part in machine learning is to use feature extraction. Feature extraction combines existing features to create a more relevant set of features. It can be done with PCA, T-SNE or any other dimensionality reduction algorithms.

For example, an image processing, the practitioner needs to extract the feature manually in the image like the eyes, the nose, lips and so on. Those extracted features are feed to the classification model.

Deep learning solves this issue, especially for a convolutional neural network. The first layer of a neural network will learn small details from the picture; the next layers will combine the previous knowledge to make more complex information. In the convolutional neural network, the feature extraction is done with the use of the filter. The network applies a filter to the picture to see if there is a match, i.e., the shape of the feature is identical to a part of the image. If there is a match, the network will use this filter. The process of feature extraction is therefore done automatically.

Difference between Machine Learning and Deep Learning

When to use ML or DL?

In the table below, we summarize ***the difference between machine learning and deep learning. ***

With machine learning, you need fewer data to train the algorithm than deep learning. Deep learning requires an extensive and diverse set of data to identify the underlying structure. Besides, machine learning provides a faster-trained model. Most advanced deep learning architecture can take days to a week to train. The advantage of deep learning over machine learning is it is highly accurate. You do not need to understand what features are the best representation of the data; the neural network learned how to select critical features. In machine learning, you need to choose for yourself what features to include in the model.

Summary

Artificial intelligence is imparting a cognitive ability to a machine. Early AI systems used pattern matching and expert systems.

The idea behind machine learning is that the machine can learn without human intervention. The machine needs to find a way to learn how to solve a task given the data.

Deep learning is the breakthrough in the field of artificial intelligence. When there is enough data to train on, **deep learning **achieves impressive results, especially for image recognition and text translation. The main reason is the feature extraction is done automatically in the different layers of the network.