Machine Learning algorithms explained


Machine learning uses algorithms to turn a data set into a predictive model. Which algorithm works best depends on the problem

Machine learning and deep learning have been widely embraced, and even more widely misunderstood. In this article, I’d like to step back and explain both machine learning and deep learning in basic terms, discuss some of the most common machine learning algorithms, and explain how those algorithms relate to the other pieces of the puzzle of creating predictive models from historical data.

What are machine learning algorithms?

Recall that machine learning is a class of methods for automatically creating predictive models from data. Machine learning algorithms are the engines of machine learning, meaning it is the algorithms that turn a data set into a model. Which kind of algorithm works best (supervised, unsupervised, classification, regression, etc.) depends on the kind of problem you’re solving, the computing resources available, and the nature of the data.

How machine learning works

Ordinary programming algorithms tell the computer what to do in a straightforward way. For example, sorting algorithms turn unordered data into data ordered by some criteria, often the numeric or alphabetical order of one or more fields in the data.

Linear regression algorithms fit a straight line, or another function that is linear in its parameters, such as a polynomial, to numeric data, typically by performing matrix inversions to minimize the squared error between the line and the data. Squared error is used as the metric because you don’t care whether the regression line is above or below the data points; you only care about the distance between the line and the points.
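
For example, here is a minimal least-squares fit in Python with NumPy; the data is made up for illustration:

```python
import numpy as np

# Toy data: y is roughly 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.shape)

# Design matrix [x, 1], so the model y = a*x + b is linear in a and b.
A = np.column_stack([x, np.ones_like(x)])

# lstsq minimizes the squared error ||A @ coef - y||^2.
coef, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)
a, b = coef
print(f"fitted line: y = {a:.2f}x + {b:.2f}")
```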

Nonlinear regression algorithms, which fit curves that are not linear in their parameters to data, are a little more complicated, because, unlike linear regression problems, they can’t be solved with a deterministic method. Instead, the nonlinear regression algorithms implement some kind of iterative minimization process, often some variation on the method of steepest descent.

Steepest descent basically computes the squared error and its gradient at the current parameter values, picks a step size (aka learning rate), follows the direction of the gradient “down the hill,” and then recomputes the squared error and its gradient at the new parameter values. Eventually, with luck, the process converges. The variants on steepest descent try to improve the convergence properties.
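
To make that concrete, here is a bare-bones steepest descent loop in Python; the function being minimized and all of the constants are illustrative:

```python
import numpy as np

def steepest_descent(grad, x0, learning_rate=0.1, tol=1e-8, max_iter=10_000):
    """Minimize a function, given its gradient, by stepping downhill."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        step = learning_rate * grad(x)
        x = x - step                    # follow the gradient "down the hill"
        if np.linalg.norm(step) < tol:  # converged: the steps have become tiny
            break
    return x

# Example: minimize f(a, b) = (a - 3)^2 + (b + 1)^2, whose gradient is (2(a-3), 2(b+1)).
grad = lambda p: np.array([2 * (p[0] - 3), 2 * (p[1] + 1)])
print(steepest_descent(grad, x0=[0.0, 0.0]))  # approaches [3, -1]
```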

Machine learning algorithms are even less straightforward than nonlinear regression, partly because machine learning dispenses with the constraint of fitting to a specific mathematical function, such as a polynomial. There are two major categories of problems that are often solved by machine learning: regression and classification. Regression is for numeric data (e.g. What is the likely income for someone with a given address and profession?) and classification is for non-numeric data (e.g. Will the applicant default on this loan?).

Prediction problems (e.g. What will the opening price be for Microsoft shares tomorrow?) are a subset of regression problems for time series data. Classification problems are sometimes divided into binary (yes or no) and multi-category problems (animal, vegetable, or mineral).

Supervised learning vs. unsupervised learning

Independent of these divisions, there are another two kinds of machine learning algorithms: supervised and unsupervised. In supervised learning, you provide a training data set with answers, such as a set of pictures of animals along with the names of the animals. The goal of that training would be a model that could correctly identify a picture (of a kind of animal that was included in the training set) that it had not previously seen.

In unsupervised learning, the algorithm goes through the data itself and tries to come up with meaningful results. The result might be, for example, a set of clusters of data points that could be related within each cluster. That works better when the clusters don’t overlap.

Training and evaluation turn supervised learning algorithms into models by optimizing their parameters to find the set of values that best matches the ground truth of your data. The algorithms often rely on variants of steepest descent for their optimizers, for example stochastic gradient descent (SGD), which is essentially steepest descent performed on randomly sampled subsets (mini-batches) of the data. Common refinements on SGD add factors that correct the direction of the gradient based on momentum or adjust the learning rate based on progress from one pass through the data (called an epoch) to the next.
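
As a sketch of how that looks for a linear model, here is mini-batch SGD with a momentum term in NumPy; the hyperparameter values are arbitrary illustrations, not recommendations:

```python
import numpy as np

def sgd_momentum(X, y, epochs=100, lr=0.01, beta=0.9, batch_size=8, seed=0):
    """Fit w in y ~ X @ w by mini-batch SGD with momentum."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    velocity = np.zeros_like(w)
    for _ in range(epochs):                 # one epoch = one pass through the data
        order = rng.permutation(len(X))     # reshuffle the rows each epoch
        for i in range(0, len(X), batch_size):
            batch = order[i:i + batch_size]
            error = X[batch] @ w - y[batch]
            grad = 2 * X[batch].T @ error / len(batch)
            velocity = beta * velocity - lr * grad  # momentum smooths the direction
            w = w + velocity
    return w
```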

Data cleaning for machine learning

There is no such thing as clean data in the wild. To be useful for machine learning, data must be aggressively filtered. For example, you’ll want to:

  1. Look at the data and exclude any columns that have a lot of missing data.
  2. Look at the data again and pick the columns you want to use for your prediction. (This is something you may want to vary when you iterate.)
  3. Exclude any rows that still have missing data in the remaining columns.
  4. Correct obvious typos and merge equivalent answers. For example, U.S., US, USA, and America should be merged into a single category.
  5. Exclude rows that have data that is out of range. For example, if you’re analyzing taxi trips within New York City, you’ll want to filter out rows with pick-up or drop-off latitudes and longitudes that are outside the bounding box of the metropolitan area.

There is a lot more you can do, but it will depend on the data collected. This can be tedious, but if you set up a data-cleaning step in your machine learning pipeline you can modify and repeat it at will.
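
As a sketch of such a step, the following pandas snippet walks through the five items above; the file name, column names, and thresholds are hypothetical:

```python
import pandas as pd

df = pd.read_csv("trips.csv")  # hypothetical input file

# 1. Exclude columns that are mostly missing (the 40% threshold is a judgment call).
df = df.loc[:, df.isna().mean() < 0.4]

# 2. Pick the columns to use for the prediction (something to vary when you iterate).
df = df[["pickup_lat", "pickup_lon", "dropoff_lat", "dropoff_lon", "fare"]]

# 3. Exclude rows that still have missing data in the remaining columns.
df = df.dropna()

# 4. Merge equivalent answers (if the data had, say, a country column):
# df["country"] = df["country"].replace({"U.S.": "USA", "US": "USA", "America": "USA"})

# 5. Exclude out-of-range rows, e.g. with a rough NYC bounding box.
df = df[df["pickup_lat"].between(40.5, 41.0) & df["pickup_lon"].between(-74.3, -73.7)]
```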

Data encoding and normalization for machine learning

To use categorical data for machine classification, you need to encode the text labels into another form. There are two common encodings.

One is label encoding, which means that each text label value is replaced with a number. The other is one-hot encoding, which means that each text label value is turned into a column with a binary value (1 or 0). Most machine learning frameworks have functions that do the conversion for you. In general, one-hot encoding is preferred, as label encoding can sometimes confuse the machine learning algorithm into thinking that the encoded column is ordered.
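
For instance, with pandas and scikit-learn (a tiny made-up example):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

colors = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# Label encoding: each label becomes an integer, which implies an order that isn't real.
labels = LabelEncoder().fit_transform(colors["color"])  # e.g. [2, 1, 0, 1]

# One-hot encoding: one binary column per label value (usually preferred).
one_hot = pd.get_dummies(colors["color"], prefix="color")
print(one_hot)
```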

To use numeric data for machine regression, you usually need to normalize the data. Otherwise, the numbers with larger ranges may tend to dominate the Euclidean distance between feature vectors, their effects may be magnified at the expense of the other fields, and the steepest descent optimization may have difficulty converging. There are a number of ways to normalize and standardize data for ML, including min-max normalization, mean normalization, standardization, and scaling to unit length. This process is often called feature scaling.
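
Three of these scalings are one-liners in NumPy (the numbers are made up):

```python
import numpy as np

x = np.array([3.0, 50.0, 120.0, 270.0])

min_max = (x - x.min()) / (x.max() - x.min())  # min-max normalization, maps to [0, 1]
standardized = (x - x.mean()) / x.std()        # standardization: zero mean, unit variance
unit_length = x / np.linalg.norm(x)            # scaling to unit length
```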

What are machine learning features?

Since I mentioned feature vectors in the previous section, I should explain what they are. First of all, a feature is an individual measurable property or characteristic of a phenomenon being observed. The concept of a “feature” is related to that of an explanatory variable, which is used in statistical techniques such as linear regression. Feature vectors combine all of the features for a single row into a numerical vector.

Part of the art of choosing features is to pick a minimum set of independent variables that explain the problem. If two variables are highly correlated, either they need to be combined into a single feature, or one should be dropped. Sometimes people perform principal component analysis to convert correlated variables into a set of linearly uncorrelated variables.

Some of the transformations that people use to construct new features or reduce the dimensionality of feature vectors are simple. For example, subtract Year of Birth from Year of Death and you construct Age at Death, which is a prime independent variable for lifetime and mortality analysis. In other cases, feature construction may not be so obvious.
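
Here is a small illustration of both ideas, constructing Age at Death with pandas and converting correlated columns into uncorrelated components with scikit-learn’s PCA; the data is invented:

```python
import pandas as pd
from sklearn.decomposition import PCA

people = pd.DataFrame({"year_of_birth": [1890, 1901, 1872],
                       "year_of_death": [1955, 1976, 1950]})

# Construct a new feature from two existing, correlated columns.
people["age_at_death"] = people["year_of_death"] - people["year_of_birth"]

# Reduce the correlated columns to linearly uncorrelated components.
components = PCA(n_components=2).fit_transform(
    people[["year_of_birth", "year_of_death", "age_at_death"]])
```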

Common machine learning algorithms

There are dozens of machine learning algorithms, ranging in complexity from linear regression and logistic regression to deep neural networks and ensembles (combinations of other models). However, some of the most common algorithms include the following (a quick comparison sketch in scikit-learn follows the list):

  • Linear regression, aka least squares regression (for numeric data)
  • Logistic regression (for binary classification)
  • Linear discriminant analysis (for multi-category classification)
  • Decision trees (for both classification and regression)
  • Naïve Bayes (for classification)
  • K-Nearest Neighbors, aka KNN (for both classification and regression)
  • Learning Vector Quantization, aka LVQ (for classification)
  • Support Vector Machines, aka SVM (for binary classification)
  • Random Forests, a type of “bagging” ensemble algorithm (for both classification and regression)
  • Boosting methods, including AdaBoost and XGBoost, ensemble algorithms that create a series of models in which each new model tries to correct the errors of the previous model (for both classification and regression)
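
Here is the comparison sketch promised above. With scikit-learn, most of these algorithms share the same fit/predict interface, so trying several of them on a toy data set takes only a few lines (this is an illustration, not a benchmark):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(),
    "naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "random forest": RandomForestClassifier(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name:20s} mean accuracy: {scores.mean():.3f}")
```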

Where are the neural networks and deep neural networks that we hear so much about? They tend to be compute-intensive to the point of needing GPUs or other specialized hardware, so you should use them only for specialized problems, such as image classification and speech recognition, that aren’t well-suited to simpler algorithms. Note that “deep” means that there are many hidden layers in the neural network.

For more on neural networks and deep learning, see “What deep learning really means.”

Hyperparameters for machine learning algorithms

Machine learning algorithms train on data to find the best set of weights for each independent variable that affects the predicted value or class. The algorithms themselves have variables, called hyperparameters. They’re called hyperparameters, as opposed to parameters, because they control the operation of the algorithm rather than the weights being determined.

The most important hyperparameter is often the learning rate, which determines the step size used when finding the next set of weights to try when optimizing. If the learning rate is too high, gradient descent may overshoot the minimum and oscillate, or even diverge. If the learning rate is too low, progress may be so slow that the descent stalls and never completely converges.

Many other common hyperparameters depend on the algorithms used. Most algorithms have stopping parameters, such as the maximum number of epochs, or the maximum time to run, or the minimum improvement from epoch to epoch. Specific algorithms have hyperparameters that control the shape of their search. For example, a Random Forest Classifier has hyperparameters for minimum samples per leaf, max depth, minimum samples at a split, minimum weight fraction for a leaf, and about 8 more.
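
For example, scikit-learn’s RandomForestClassifier exposes the hyperparameters just mentioned under the following names; the values here are arbitrary:

```python
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(
    n_estimators=200,              # number of trees in the forest
    max_depth=10,                  # maximum depth of each tree
    min_samples_split=4,           # minimum samples required to split a node
    min_samples_leaf=2,            # minimum samples required at a leaf
    min_weight_fraction_leaf=0.0,  # minimum weighted fraction of samples at a leaf
    random_state=42,
)
```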

Hyperparameter tuning

Several production machine-learning platforms now offer automatic hyperparameter tuning. Essentially, you tell the system what hyperparameters you want to vary, and possibly what metric you want to optimize, and the system sweeps those hyperparameters across as many runs as you allow. (Google Cloud hyperparameter tuning extracts the appropriate metric from the TensorFlow model, so you don’t have to specify it.)

There are three search algorithms for sweeping hyperparameters: Bayesian optimization, grid search, and random search. Bayesian optimization tends to be the most efficient.

You would think that tuning as many hyperparameters as possible would give you the best answer. However, unless you are running on your own personal hardware, that could be very expensive. There are diminishing returns, in any case. With experience, you’ll discover which hyperparameters matter the most for your data and choice of algorithms.
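
As a small illustration of a hyperparameter sweep, here is a grid search with scikit-learn; Bayesian optimization would need an additional library such as Optuna or scikit-optimize:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Sweep two hyperparameters over a grid, scoring each combination by cross-validation.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100, 200], "max_depth": [4, 8, None]},
    scoring="accuracy",
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```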

Automated machine learning

Speaking of choosing algorithms, there is only one way to know which algorithm or ensemble of algorithms will give you the best model for your data, and that’s to try them all. If you also try all the possible normalizations and choices of features, you’re facing a combinatorial explosion.

Trying everything is impractical to do manually, so of course machine learning tool providers have put a lot of effort into releasing AutoML systems. The best ones combine feature engineering with sweeps over algorithms and normalizations. Hyperparameter tuning of the best model or models is often left for later. Feature engineering is a hard problem to automate, however, and not all AutoML systems handle it.

In summary, machine learning algorithms are just one piece of the machine learning puzzle. In addition to algorithm selection (manual or automatic), you’ll need to deal with optimizers, data cleaning, feature selection, feature normalization, and (optionally) hyperparameter tuning.

When you’ve handled all of that and built a model that works for your data, it will be time to deploy the model, and then update it as conditions change. Managing machine learning models in production is, however, a whole other can of worms.

Artificial Intelligence vs. Machine Learning vs. Deep Learning

In this article, we are going to discuss the difference between Artificial Intelligence, Machine Learning, and Deep Learning.

Furthermore, we will address the question of why Deep Learning, as a young, emerging field, is far superior to traditional Machine Learning.

Artificial Intelligence, Machine Learning, and Deep Learning are popular buzzwords that everyone seems to use nowadays.

But still, there is a big misconception among many people about the meaning of these terms.

In the worst case, one may think that these terms describe the same thing — which is simply false.

A large number of companies nowadays claim to incorporate some kind of “Artificial Intelligence” (AI) in their applications or services.

But artificial intelligence is a broader term that describes applications in which a machine mimics “cognitive” functions that humans associate with other human minds, such as “learning” and “problem-solving.”

On a lower level, an AI can be just a programmed rule that tells the machine to behave in a certain way in certain situations. So, at its most basic, Artificial Intelligence can be nothing more than a bunch of if-else statements.

An if-else statement is a simple rule explicitly programmed by a human. Consider a very abstract, simple example of a robot that is moving on a road. A possible programmed rule for that robot could look as follows:
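
(Sketched here in Python; the rule, names, and conditions are purely illustrative.)

```python
def react(obstacle_ahead: bool, light_is_red: bool) -> str:
    """A hypothetical hand-coded rule for a road-following robot."""
    if obstacle_ahead:
        return "stop"
    elif light_is_red:
        return "wait"
    else:
        return "keep driving"
```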

Instead, when speaking of Artificial Intelligence, it is only worthwhile to consider two different approaches: Machine Learning and Deep Learning. Both are subfields of Artificial Intelligence.

Machine Learning vs Deep Learning

Now that we better understand what Artificial Intelligence means, we can take a closer look at Machine Learning and Deep Learning and make a clearer distinction between the two.

Machine Learning incorporates “classical” algorithms for various kinds of tasks such as clustering, regression, or classification. Machine Learning algorithms must be trained on data. The more data you provide to your algorithm, the better it gets.

The “training” part of a Machine Learning model means that this model tries to optimize along a certain dimension. In other words, the Machine Learning models try to minimize the error between their predictions and the actual ground truth values.

For this we must define a so-called error function, also called a loss function or an objective function, because, after all, the model has an objective. This objective could be, for example, the classification of data into different categories (e.g. cat and dog pictures) or the prediction of the expected price of a stock in the near future.
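
A very common error function for regression tasks is the mean squared error; a minimal Python version looks like this:

```python
import numpy as np

def mean_squared_error(predictions, ground_truth):
    """Loss function: the average squared difference between prediction and truth."""
    predictions = np.asarray(predictions)
    ground_truth = np.asarray(ground_truth)
    return np.mean((predictions - ground_truth) ** 2)

print(mean_squared_error([2.5, 0.0, 2.1], [3.0, -0.5, 2.0]))  # 0.17
```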

When someone says they are working with a machine-learning algorithm, you can get to the gist of its value by asking: What’s the objective function?

At this point, you may ask: How do we minimize the error?

One way would be to compare the prediction of the model with the ground truth value and adjust the parameters of the model in a way so that next time, the error between these two values is smaller. This is repeated again and again and again.

Thousands and millions of times, until the parameters of the model that determine the predictions are so good that the difference between the predictions of the model and the ground truth labels is as small as possible.

In short, machine learning models are optimization algorithms. If you tune them right, they minimize their error by guessing and guessing and guessing again.

Machine Learning is old…

The basic definition of machine learning is:

Algorithms that analyze data, learn from it and make informed decisions based on the learned insights.

Machine learning leads to a variety of automated tasks. It affects virtually every industry — from malware detection in IT security, to weather forecasting, to stockbrokers looking for cheap trades. Machine learning requires complex math and a lot of coding to finally achieve the desired functions and results.

Machine learning algorithms need to be trained on large amounts of data.
The more data you provide for your algorithm, the better it gets.

Machine Learning is a pretty old field and incorporates methods and algorithms that have been around for dozens of years, some of them since as early as the sixties.

These classic algorithms include algorithms such as the so-called Naive Bayes Classifier and the Support Vector Machines. Both are often used in the classification of data.

In addition to classification, there are also cluster analysis algorithms such as the well-known K-Means and tree-based clustering. To reduce the dimensionality of data and gain more insight into its nature, machine learning uses methods such as principal component analysis and t-SNE.
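
As a brief illustration, here are K-Means clustering and PCA dimensionality reduction with scikit-learn on a built-in toy data set:

```python
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

# K-Means groups the rows into k clusters.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# PCA reduces the four original dimensions to two, e.g. for plotting.
X_2d = PCA(n_components=2).fit_transform(X)
```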

Deep Learning — The next big Thing

Now let’s focus on the essential thing at stake here: deep learning. Deep Learning is a very young field of artificial intelligence based on artificial neural networks.

Again, deep learning can be seen as a part of machine learning, because deep learning algorithms also need data in order to learn how to solve problems. For that reason, the terms machine learning and deep learning are often used interchangeably. However, these systems have different capabilities.

Deep Learning uses a multi-layered structure of algorithms called a neural network:
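
(Sketched below as a forward pass in NumPy; the layer sizes and random weights are purely illustrative, since real weights would be learned during training.)

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)  # a common activation function

rng = np.random.default_rng(0)
x = rng.normal(size=4)  # raw input features

# Two hidden layers and an output layer.
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 8)), np.zeros(8)
W3, b3 = rng.normal(size=(2, 8)), np.zeros(2)

h1 = relu(W1 @ x + b1)   # first layer of "neurons"
h2 = relu(W2 @ h1 + b2)  # a deeper, more abstract representation
output = W3 @ h2 + b3    # e.g. scores for two classes
```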

Although methods of Deep Learning are able to perform the same tasks as classic Machine Learning algorithms, the reverse is not true.

Artificial neural networks have unique capabilities that enable Deep Learning models to solve tasks that Machine Learning models could never solve.

All recent advances in artificial intelligence are due to Deep Learning. Without Deep Learning, we would not have self-driving cars, chatbots, or personal assistants like Alexa and Siri. The Google Translate app would remain primitive, and Netflix would have no idea which movies or TV series we like or dislike.

We can even go so far as to say that the new industrial revolution is driven by artificial neural networks and Deep Learning. This is the best and closest approach to true machine intelligence we have so far. The reason is that Deep Learning has two major advantages over Machine Learning.

Why is Deep Learning better than Machine Learning?

Feature Extraction

The first advantage of Deep Learning over machine learning is that it removes the need for so-called Feature Extraction.

Long before deep learning was used, traditional machine learning methods were popular, such as Decision Trees, SVM, Naïve Bayes Classifier and Logistic Regression. These algorithms are also called “flat algorithms”.

“Flat” means here that these algorithms cannot normally be applied directly to raw data (such as .csv files, images, or text). They require a preprocessing step called Feature Extraction.

The result of Feature Extraction is an abstract representation of the given raw data that can now be used by these classic machine learning algorithms to perform a task, for example, classifying the data into several categories or classes.

Feature Extraction is usually pretty complicated and requires detailed knowledge of the problem domain. This step must be adapted, tested and refined over several iterations for optimal results.

On the other side are the artificial neural networks. These do not require the step of feature extraction. The layers are able to learn an implicit representation of the raw data directly on their own.

Here, a more and more abstract and compressed representation of the raw data is produced over several layers of an artificial neural network. This compressed representation of the input data is then used to produce the result. The result can be, for example, the classification of the input data into different classes.

In other words, we can also say that the feature extraction step is already a part of the process that takes place in an artificial neural network. During the training process, this step is also optimized by the neural network to obtain the best possible abstract representation of the input data. This means that the models of deep learning thus require little to no manual effort to perform and optimize the feature extraction process.

For example, if you want to use a machine learning model to determine whether a particular image shows a car or not, we humans first need to identify the unique features of a car (shape, size, windows, wheels, etc.), extract these features and give them to the algorithm as input data. This way, the machine learning algorithm would perform a classification of the image. That is, in machine learning, a programmer must intervene directly in the classification process.

In the case of a deep learning model, the feature extraction step is completely unnecessary. The model would recognize these unique characteristics of a car and make correct predictions, completely without the help of a human.

In fact, the same applies to every other task you’ll ever do with neural networks. You just give the raw data to the neural network; the rest is done by the model.

The Era of Big Data…

The second huge advantage of Deep Learning, and a key part of understanding why it’s becoming so popular, is that it’s powered by massive amounts of data. The “Big Data Era” of technology will provide huge amounts of opportunities for new innovations in deep learning. Andrew Ng, former chief scientist of China’s major search engine Baidu and one of the leaders of the Google Brain Project, has compared deep learning to a rocket: the rocket engine is the deep learning models and the fuel is the huge amounts of data we can feed to these algorithms.

Deep Learning models scale better with a larger amount of data

Deep Learning models tend to increase their accuracy with the increasing amount of training data, whereas traditional machine learning models such as SVM and the Naive Bayes classifier stop improving after a saturation point.


The Difference Between Artificial Intelligence, Machine Learning, and Deep Learning

In this post, I give simple explanations of Artificial Intelligence, Machine Learning, and Deep Learning and how they’re all different. Plus, how AI and IoT are inextricably connected.

We’re all familiar with the term “Artificial Intelligence.” After all, it’s been a popular focus in movies such as The Terminator, The Matrix, and Ex Machina (a personal favorite of mine). But you may have recently been hearing about other terms like “Machine Learning” and “Deep Learning” sometimes used interchangeably with artificial intelligence. As a result, the difference between artificial intelligence, machine learning, and deep learning can be very unclear.

I’ll begin by giving a quick explanation of what Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) actually mean and how they’re different. Then, I’ll share how AI and the Internet of Things are inextricably intertwined, with several technological advances all converging at once to set the foundation for an AI and IoT explosion.

So what’s the difference between AI, ML, and DL?

First coined in 1956 by John McCarthy, AI involves machines that can perform tasks that are characteristic of human intelligence. While this is rather general, it includes things like planning, understanding language, recognizing objects and sounds, learning, and problem solving.

We can put AI in two categories, general and narrow. General AI would have all of the characteristics of human intelligence, including the capacities mentioned above. Narrow AI exhibits some facet(s) of human intelligence, and can do that facet extremely well, but is lacking in other areas. A machine that’s great at recognizing images, but nothing else, would be an example of narrow AI.

At its core, machine learning is simply a way of achieving AI.

Arthur Samuel coined the phrase not too long after AI, in 1959, defining it as “the ability to learn without being explicitly programmed.” You see, you can get AI without using machine learning, but this would require building millions of lines of code with complex rules and decision trees.

So instead of hard-coding software routines with specific instructions to accomplish a particular task, machine learning is a way of “training” an algorithm so that it can learn how. “Training” involves feeding huge amounts of data to the algorithm and allowing the algorithm to adjust itself and improve.

To give an example, machine learning has been used to make drastic improvements to computer vision (the ability of a machine to recognize an object in an image or video). You gather hundreds of thousands or even millions of pictures and then have humans tag them. For example, the humans might tag pictures that have a cat in them versus those that do not. Then, the algorithm tries to build a model that can accurately tag a picture as containing a cat or not as well as a human. Once the accuracy level is high enough, the machine has now “learned” what a cat looks like.

Deep learning is one of many approaches to machine learning. Other approaches include decision tree learning, inductive logic programming, clustering, reinforcement learning, and Bayesian networks, among others.

Deep learning was inspired by the structure and function of the brain, namely the interconnecting of many neurons. Artificial Neural Networks (ANNs) are algorithms that mimic the biological structure of the brain.

In ANNs, there are “neurons” which have discrete layers and connections to other “neurons.” Each layer picks out a specific feature to learn, such as curves and edges in image recognition. It’s this layering that gives deep learning its name: depth is created by using multiple layers as opposed to a single layer.

AI and IoT are Inextricably Intertwined

I think of the relationship between AI and IoT much like the relationship between the human brain and body.

Our bodies collect sensory input such as sight, sound, and touch. Our brains take that data and make sense of it, turning light into recognizable objects and turning sounds into understandable speech. Our brains then make decisions, sending signals back out to the body to command movements like picking up an object or speaking.

All of the connected sensors that make up the Internet of Things are like our bodies: they provide the raw data of what’s going on in the world. Artificial intelligence is like our brain, making sense of that data and deciding what actions to perform. And the connected devices of IoT are again like our bodies, carrying out physical actions or communicating to others.

Unleashing Each Other’s Potential

The value and the promises of both AI and IoT are being realized because of the other.

Machine learning and deep learning have led to huge leaps for AI in recent years. As mentioned above, machine learning and deep learning require massive amounts of data to work, and this data is being collected by the billions of sensors that are continuing to come online in the Internet of Things. IoT makes better AI.

Improving AI will also drive adoption of the Internet of Things, creating a virtuous cycle in which both areas will accelerate drastically. That’s because AI makes IoT useful.

On the industrial side, AI can be applied to predict when machines will need maintenance or analyze manufacturing processes to make big efficiency gains, saving millions of dollars.

On the consumer side, rather than having to adapt to technology, technology can adapt to us. Instead of clicking, typing, and searching, we can simply ask a machine for what we need. We might ask for information like the weather or for an action like preparing the house for bedtime (turning down the thermostat, locking the doors, turning off the lights, etc.).

Converging Technological Advancements Have Made this Possible

Shrinking computer chips and improved manufacturing techniques mean cheaper, more powerful sensors.

Quickly improving battery technology means those sensors can last for years without needing to be connected to a power source.

Wireless connectivity, driven by the advent of smartphones, means that data can be sent in high volume at cheap rates, allowing all those sensors to send data to the cloud.

And the birth of the cloud has allowed for virtually unlimited storage of that data and virtually infinite computational ability to process it.

Of course, there are one or two concerns about the impact of AI on our society and our future. But as advancements and adoption of both AI and IoT continue to accelerate, one thing is certain: the impact is going to be profound.

Originally published at https://www.iotforall.com