Machine Learning vs Traditional Programming


Difference between Machine Learning and Traditional Programming


While some dismiss AI and ML as overhyped themes that are nothing more than if statements, or just more programming, I invite you to look at the evidence and judge for yourself. In this post, I will contrast these terms and also highlight the difference between the specialists involved in these two spheres: who are they? Software engineer, software developer, machine learning expert, data scientist…some people say programmer or coder, and some even go as far as ninja, guru, or rock star! But are they really the same? And where exactly is the line between Machine Learning and Traditional Programming?

ML vs Programming: First, What’s Machine Learning?

It is easy to say that AI and ML are nothing more than if statements, or even just simple statistics. What else do we hear? That ML is simply a new word for math plus algorithms? Such simplifications can be amusing, but ML is obviously more complicated than that.

But let’s have a look at a more appropriate explanation.

So, in simple words, Artificial Intelligence is an umbrella term that covers other realms like image processing, cognitive science, neural networks and much more. Machine Learning is one component under this umbrella. Its core idea is that the computer does not just follow a pre-written algorithm, but learns how to solve the problem itself. To put it in other words, there is an excellent definition by Arthur Samuel (who actually coined the term Machine Learning):

Machine Learning is a field of study that gives computers the ability to learn without being explicitly programmed.
So, yes, ML teaches a machine to solve various complex tasks that are difficult to solve algorithmically. What are those tasks? You have probably already stumbled upon them in practice: face recognition on your phone, voice understanding, driving a car (Google's self-driving car), diagnosing diseases from symptoms (Watson), recommending products and books (Amazon), movies (Netflix) and music (Spotify), or acting as a personal assistant (Siri, Cortana)…the list goes on and on.

Okay, I hope that was clear enough, and now it is time to move on to another important thing about ML.

Any working ML technology can be loosely assigned to one of three levels of accessibility. What does this mean? Well, the first level is when it is available exclusively to major tech giants like Google or IBM. The second level is when a student with a certain amount of knowledge can use it. And the third level of ML accessibility is when even a granny is able to cope with it.

At the current stage of development, Machine Learning sits at the junction of the second and third levels. Because of this, the rate at which this technology changes the world is growing at cosmic speed.

One last thing about ML: most ML tasks can be divided into learning with a teacher (supervised learning) and learning without a teacher (unsupervised learning). And if you are imagining a programmer with a whip in one hand and a piece of sugar in the other, you are a little bit mistaken.

Here, the "teacher" stands for the idea of human involvement in data processing. In supervised learning (learning with a teacher), we have labeled data and need to predict something based on it. In unsupervised learning (learning without a teacher), we again have data, but this time we need to discover its structure and properties on our own.
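To make the distinction concrete, here is a minimal sketch in Python, assuming scikit-learn (a library choice of mine, not named above): the supervised model is handed the answers it should learn to predict, while the unsupervised model only gets the raw data and has to find structure in it by itself.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

# Supervised learning: features X come with known answers y (the "teacher").
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 3.9, 6.2, 8.1])              # labels we want to predict
supervised = LinearRegression().fit(X, y)
print(supervised.predict([[5.0]]))              # prediction for unseen input

# Unsupervised learning: only the data is given; the algorithm finds groups.
points = np.array([[0.1, 0.2], [0.0, 0.1], [5.0, 5.1], [5.2, 4.9]])
unsupervised = KMeans(n_clusters=2, n_init=10).fit(points)
print(unsupervised.labels_)                     # discovered group membership
```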

ML vs Programming: Okay, Then How Does It Differ from Programming?

In traditional programming you hard code the behavior of the program. In machine learning, you leave a lot of that to the machine to learn from data.
Consequently, these terms are in no way interchangeable: a data engineer cannot replace the work of traditional programming and vice versa. Although every data engineer has to use at least one coding language, traditional programming is only a small part of what he or she does. On the other hand, we cannot say a software developer uses ML algorithms to launch a website.

ML, just like AI, is not a substitute for traditional programming approaches but a supplement to them. For instance, ML can be used to build predictive algorithms for an online trading platform, while the platform's UI, data visualization and other elements will still be implemented in a mainstream programming language such as Ruby or Java.

So, here is the main thing: ML is used when a traditional programming strategy falls short and is not enough to fully implement a certain task.

What does this mean in practice? Here is a good explanation based on a classic ML problem, exchange rate forecasting, and two different ways to solve it:

Traditional programming approach

With any solution, the first task is to design the most suitable algorithm and write the code. After that, you set the input parameters and, if the implemented algorithm is correct, it will produce the expected result.

How a software developer creates a solution

However, when we need to predict something, the algorithm has to take a variety of input parameters. To predict an exchange rate, we must account for details like yesterday's rate, as well as external and internal economic changes in the country that issues the currency, and more.

Consequently, we handcraft a solution that accepts a set of parameters and, based on the input data, predicts a new exchange rate.
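A minimal sketch of this hand-crafted approach in Python (the rule, the weights and the parameter names below are hypothetical, chosen only to illustrate explicitly coded behavior):

```python
def predict_rate(yesterday_rate: float,
                 inflation_change: float,
                 interest_rate_change: float) -> float:
    """Hard-coded forecasting rule: every coefficient below was chosen
    by the developer, not learned from data."""
    prediction = yesterday_rate
    prediction += 0.50 * inflation_change        # hand-picked weight
    prediction -= 0.30 * interest_rate_change    # hand-picked weight
    return prediction

print(predict_rate(yesterday_rate=1.10,
                   inflation_change=0.02,
                   interest_rate_change=0.01))
```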

But it is important to mention one more thing, or rather one problem with this approach. So what is it?

Well, it is simple: an accurate model needs hundreds or even thousands of parameters, whereas a limited, hand-picked set only allows us to build a very basic, unscalable model. And for any person, working with such massive data arrays is troublesome.

Then we have a somewhat different Machine Learning approach to this task. So what is it?

To solve the same problem using ML methods, data engineers follow a totally different procedure. Instead of developing an algorithm themselves, they collect an array of historical data that will be used for semi-automatic model building.

After assembling a satisfactory set of data, the data engineer loads it into already prepared ML algorithms. The result is a model that can produce a new prediction whenever it receives new data as input.
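Here is a minimal sketch of that workflow, again in Python and again assuming scikit-learn (my illustrative choice); the historical rates are made-up numbers, and in a real system the feature matrix would hold thousands of columns rather than two.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Historical data: each row is [yesterday's rate, some economic indicator],
# and the target is the rate that was actually observed the next day.
X_history = np.array([[1.08, 0.010],
                      [1.09, 0.012],
                      [1.10, 0.011],
                      [1.12, 0.015]])
y_next_day = np.array([1.09, 1.10, 1.12, 1.13])

# The model-building step is done by the algorithm, not written by hand.
model = LinearRegression().fit(X_history, y_next_day)

# Given today's observations, the learned model predicts tomorrow's rate.
print(model.predict(np.array([[1.13, 0.014]])))
```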

How a data engineer develops a solution using machine learning

A distinctive feature of ML is that there is no need to hand-build the model. That complicated yet meaningful responsibility is carried out by the ML algorithms, and the ML expert only makes minor adjustments to the result.

Another significant difference between ML and programming lies in the number of input parameters the model is capable of processing. For an accurate prediction you have to use thousands of parameters, and use them precisely, since every one of them affects the final result. A human being simply cannot write an algorithm that uses all of those details in a sensible way.

For ML, however, there are no such restrictions. As long as you have enough processing power and memory, you can use as many input parameters as you see fit. This is what makes ML so powerful and widespread nowadays.

Summing it up: ML expert, Data scientist, programmer, and software engineer…who is who?

According to Wikipedia, **Data Science** is a multi-disciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data.
So far it does not sound very cool.

But then there is something interesting:

use the most powerful hardware, the most powerful programming systems, and the most efficient algorithms to solve problems.
And then an even more interesting part:
In 2012, Harvard Business Review called it The Sexiest Job of the 21st Century.
So, Data Science is another extensive umbrella, just like Computer Science, only Data Science is aimed at processing data and extracting useful information from it.

What about programming? Data scientists nowadays program mostly in the interest of research. They are not only programmers; they are usually also expected to have an applied statistics or research background. Some also do software engineering, especially at companies that ship data science/ML in their products. The most interesting thing is that a Data Scientist is not obliged to program well and may stick to tools like Matlab, SPSS, SAS, etc.

What then is the position of Machine Learning Engineer?

The position of Machine Learning Engineer is more "technical". In other words, an ML Engineer has more in common with a classic Software Engineer than a Data Scientist does.

The standard tasks of an ML Engineer are generally similar to those of a Data Scientist: you also need to be able to work with data, experiment with different machine learning algorithms that will solve the problem, and create prototypes and ready-made solutions.

Of the key differences, I would highlight:

  • Strong programming skills in one or more languages (usually Python);
  • Less emphasis on the ability to work in data analysis environments, but more emphasis on Machine Learning algorithms;
  • The ability to use ready-made libraries for different stacks in an application, for example NumPy / SciPy for Python (see the sketch after this list);
  • The ability to create distributed applications using Hadoop and more.
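As a small illustration of the NumPy / SciPy point above, here is a hedged sketch of the kind of everyday routine an ML Engineer leans on ready-made libraries for: standardizing a feature matrix and fitting a linear model with an off-the-shelf least-squares solver. The numbers and the choice of routine are mine, purely for illustration.

```python
import numpy as np

# A tiny feature matrix: rows are samples, columns are features.
X = np.array([[1.0, 200.0],
              [2.0, 180.0],
              [3.0, 240.0],
              [4.0, 210.0]])
y = np.array([3.0, 5.1, 7.2, 8.9])

# Standardize each column (vectorized, no explicit loops).
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Fit a linear model with a ready-made least-squares routine.
A = np.hstack([X_std, np.ones((X_std.shape[0], 1))])  # add intercept column
coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coefs)  # learned weights and intercept
```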

And now let's move back to programming and have a closer look at what tasks are assigned to the programmer.

A programmer here is closer to a data analyst or business systems developer. They do not have to build systems themselves; they write loosely structured code against existing systems. So, yes, we can call data science a new wave of programming, but coding is only a small part of it. Don't be mistaken about that.

But if we dig deeper, we will find there are other terms like **Software Engineer** and Software Developer, and they are not the same either. For example, software engineers have to engineer things: they deal with production applications, distributed systems, concurrency, build systems and microservices. And just for the record, a software developer needs to understand the whole software development cycle, not just implementation (which sometimes does not even require any programming or coding).

So, programming and machine learning…do you feel the difference now? I hope this post helped you avoid confusion around these terms. Undoubtedly, they all have technology in common, but their differences far outnumber their similarities. Thus, ML engineer, software engineer and software developer are by no means interchangeable.

Data Science Vs Machine Learning Vs Artificial Intelligence


Data Science vs Artificial Intelligence vs Machine Learning vs Deep Learning - Learn about each concept and relation between them for their ...


What is Data Science?

Data Science is an interdisciplinary field whose primary objective is the extraction of meaningful knowledge and insights from data. These insights are extracted with the help of various mathematical and Machine Learning-based algorithms. Hence, Machine Learning is a key element of Data Science.

Alongside Machine Learning, as the name suggests, "data" itself is the fuel of Data Science. Without appropriate data, key insights cannot be extracted. Both the volume and the accuracy of data matter in this field, since the algorithms are designed to "learn" with "experience", which comes from the data provided. Data Science involves the use of various types of data from multiple sources: image data, text data, video data, time-dependent data, time-independent data, audio data, etc.

Data Science requires knowledge of multiple disciplines. As shown in the figure, it is a combination of Mathematics and Statistics, Computer Science skills and Domain Specific Knowledge. Without a mastery of all these sub-domains, the grasp on Data Science will be incomplete.

What is Machine Learning?

Machine Learning is a subset, or a part, of Artificial Intelligence. It primarily involves the scientific study of algorithmic, mathematical, and statistical models which perform a specific task by analyzing data, without any explicit step-by-step instructions, relying instead on patterns and inference drawn from the data. This also explains its alias, Pattern Recognition.

Its objective is to recognize patterns in a given data and draw inferences, which allows it to perform a similar task on similar but unseen data. These two separate sets of data are known as the “Training Set” and “Testing Set” respectively.
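A minimal sketch of that training/testing split, assuming Python with scikit-learn (the toy dataset and the 80/20 split ratio are illustrative assumptions):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy feature/label data; in practice this would come from a real dataset.
X = np.arange(20).reshape(-1, 1)
y = (X.ravel() > 9).astype(int)

# Training Set: used to fit the model. Testing Set: similar but unseen data,
# used to check that the learned patterns generalize.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = DecisionTreeClassifier().fit(X_train, y_train)
print("accuracy on unseen data:", model.score(X_test, y_test))
```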

Machine Learning primarily finds its applications in solving complex problems which a normal procedure-oriented program cannot solve, or in places where there are too many variables that would have to be explicitly programmed, which is not feasible.

As shown in the figure, Machine Learning is primarily of three types, namely: Supervised Learning, Unsupervised Learning and Reinforcement Learning.

  • Supervised Learning: This is the most commonly used form of machine learning and is widely used across the industry. In fact, most of the problems that are solved by Machine Learning belong to Supervised Learning. A learning problem is known as supervised learning when the data is in the form of feature-label pairs. In other words, the algorithm is trained on data where the ground truth is known. This is learning with a teacher. Two common types of supervised learning are:
    Classification: This is a process where the dataset is categorized into discrete values or categories. For example, if the input to the algorithm is an image of a dog or a cat, ideally, a well-trained algorithm should be able to predict whether the input image is that of a dog or of a cat.
    Regression: This is a process where the dataset has continuous-valued targets. That is, the output of the function is not a category, but a continuous value. For example, algorithms that forecast the future price of the stock market would output a continuous value (like 34.84, etc.) for a given set of inputs.
  • Unsupervised Learning: This is a much less used, but quite important learning technique. This technique is primarily used when there is unlabeled data, or data without the target values mentioned. In such learning, the algorithm has to analyze the data itself and bring out insights based on certain common traits or features in the dataset. This is learning without a teacher. Two common types of unsupervised learning are:
    Clustering: Clustering is a well-known unsupervised learning technique where similar data points are automatically grouped together by the algorithm based on common features or traits (e.g. color, values, similarity, difference, etc.).
    Dimensionality Reduction: Yet another popular unsupervised learning technique is dimensionality reduction. The dataset that is used for machine learning is often huge and of high dimension (higher than three dimensions). One major problem in working with high-dimensional data is data visualization: since we can visualize and understand up to 3 dimensions, higher-dimensional data is often difficult for human beings to interpret. In addition, higher dimension means more features, which in turn means a more complex model, which is often a curse for any machine learning model. The aim is to keep the simplest model that works best on a wide range of unseen data. Hence, dimensionality reduction is an important part of working with high-dimensional data. One of the most common methods of dimensionality reduction is Principal Component Analysis (PCA); a minimal sketch of both techniques follows this list.
  • Reinforcement Learning: This is a completely different approach to "learning" when compared to the previous two categories. This particular class of learning algorithms primarily finds its applications in Game AI, Robotics and Automatic Trading Bots. Here, the machine is not provided with a huge amount of data. Instead, in a given scenario (playground) some parameters and constraints are defined and the algorithm is let loose. The only feedback given to the algorithm is that, if it wins or performs a correct task, it is rewarded. If it loses or performs an incorrect task, it is penalized. Based on this minimal feedback, over time the algorithm learns how to do the correct task on its own.
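As referenced above, here is a minimal sketch of the two unsupervised techniques, assuming Python with scikit-learn (the synthetic data and the choice of two clusters / two components are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic 5-dimensional data drawn around two different centers, no labels.
group_a = rng.normal(loc=0.0, scale=0.5, size=(50, 5))
group_b = rng.normal(loc=3.0, scale=0.5, size=(50, 5))
X = np.vstack([group_a, group_b])

# Clustering: group similar rows together without a teacher.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", np.bincount(clusters))

# Dimensionality reduction: compress 5 dimensions down to 2, e.g. to plot it.
X_2d = PCA(n_components=2).fit_transform(X)
print("reduced shape:", X_2d.shape)   # (100, 2)
```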
What is Artificial Intelligence?

Artificial Intelligence is a vast field made up of multidisciplinary subjects, which aims to artificially create "intelligence" in machines, similar to that displayed by humans and animals. The term is used to describe machines that mimic cognitive functions such as learning and problem-solving.

Artificial Intelligence can be broadly classified into three parts: Analytical AI, Human-Inspired AI, and Humanized AI.

  1. Analytical AI: It only has characteristics which are consistent with Cognitive Intelligence. It generates a cognitive representation of the world around it based on past experiences, which inspires future decisions.
  2. Human-Inspired AI: In addition to having Cognitive Intelligence, this class of AI also has Emotional Intelligence. It has a deeper understanding of human emotions and thus a better understanding of the world around it. Both Cognitive Intelligence and Emotional Intelligence contribute to the decision making of Human-Inspired AI.
  3. Humanized AI: This is the most advanced form of AI among the three. It incorporates Cognitive Intelligence, Emotional Intelligence, and Social Intelligence into its decision making. With a broader understanding of the world around it, this form of AI is able to make self-conscious and self-aware decisions and interactions with the external world.
How are they interrelated?

From the above introductions, it may seem that these fields are unrelated to each other. However, that is not the case. The three fields are more closely related to one another than it may seem.

If we draw a Venn diagram, Artificial Intelligence, Machine Learning and Data Science are overlapping sets, with Machine Learning being a subset of Artificial Intelligence and a significant chunk of Data Science falling under Artificial Intelligence and Machine Learning.

Artificial Intelligence is a much broader field that incorporates most other intelligence-related fields of study. Machine Learning, being a part of AI, deals with algorithmic learning and inference based on data. Finally, Data Science is primarily based on statistics and probability theory, with a significant contribution from Machine Learning, and therefore from AI as well, since Machine Learning is a subset of Artificial Intelligence.

Similarities: all three fields have one thing in common, Machine Learning. Each of them is heavily dependent on Machine Learning algorithms.

In Data Science, the statistical algorithms that are used are limited to certain applications. In most cases, Data Scientists rely on Machine Learning techniques to extract inferences from data.

The current technological advancement in Artificial Intelligence is heavily based on Machine Learning; AI without Machine Learning is like a car without an engine. Strip away the "learning" part and Artificial Intelligence is basically Expert Systems plus Search and Optimization algorithms.

Difference between the three

Even though they overlap significantly, there are still a few key differences to note.

Applications

Since all three domains are interrelated, they share some applications, while others are unique to each of them. Most applications involve the use of Machine Learning in some form or other. Even so, each domain has certain applications that are unique to it. A few of them are listed below:

  • Data Science: The applications in this domain are dependent on machine learning and mathematical algorithms, such as statistics and probability based algorithms.
    Time Series Forecasting: This is a very important application of data science and is used across the industry, primarily in the banking sector and the stock market sector. Even though there are Machine Learning based algorithms for this specific application, Data Scientists usually prefer the statistical approach.
    Recommendation Engines: This is a statistics-based approach to recommending products or services to the user, based on data about his/her previous interests. As with the previous application, Machine Learning based algorithms that achieve similar or better results also exist.
  • Machine Learning: The applications of this domain are nearly limitless. Every industry has some problem that can partially or fully be solved by Machine Learning techniques. Even Data Science and Artificial Intelligence roles make use of Machine Learning to solve a huge set of problems.
    Computer Vision: This is a sub-field which falls under Machine Learning and deals with visual information. It finds applications in many industries, for example Autonomous Driving Vehicles, Medical Imaging, Autonomous Surveillance Systems, etc.
    Natural Language Processing: Similar to the previous example, this is a self-contained sub-field of research. Natural Language Processing (NLP) or Natural Language Understanding (NLU) primarily deals with the interpretation and understanding of the meaning behind spoken or written text/language. Understanding the exact meaning of a sentence is quite difficult (even for human beings); teaching a machine to understand the meaning behind a text is even more challenging. A few of the major applications of this sub-field are intelligent chatbots, artificial voice assistants (Google Assistant, Siri, Alexa, etc.), spam detection, hate speech detection and so on.
  • Artificial Intelligence: Most of the current advancements and applications in this domain are based on a sub-field of Machine Learning known as Deep Learning. Deep Learning deals with artificially emulating the structure and function of the biological neuron. However, since a few applications of Deep Learning have already been discussed under Machine Learning, let us look at applications of Artificial Intelligence that are not primarily dependent on Machine Learning.
    Game AI: Game AI is an interesting application of Artificial Intelligence, where the machine automatically learns to play complex games to the level where it can challenge and even win against a human being. Google's DeepMind developed a Game AI called AlphaGo, which outperformed and beat the human world champion in 2017. Similarly, video game AIs have been developed to play Dota 2, Flappy Bird and Mario. These models are developed using several algorithms like Search and Optimization, Generative Models, Reinforcement Learning, etc.
    Search: Artificial Intelligence has found several applications in Search Engines, for example Google and Bing Search. The method of displaying results and the order in which results are displayed are based on algorithms developed in the field of Artificial Intelligence. These applications do contain Machine Learning techniques, but their older versions were built on algorithms like Google's proprietary PageRank algorithm, which were not based on "learning".
    Robotics: One of the major applications of Artificial Intelligence is in the field of robotics. Teaching robots to walk/run automatically (for example, Spot and Atlas) using Reinforcement Learning has been one of the biggest goals of companies like Boston Dynamics. In addition to that, humanoid robots like Sophia are a perfect example of AI being applied toward Humanized AI.

Skill-set Required

Since the fields are interrelated to a significant degree, the skill-set required to master each of them is largely the same and overlapping. However, a few skills are uniquely associated with each of them; these are discussed below.

  • Mathematics: Each of these fields is math heavy, which means mathematics is the basic building block of all of them, and in order to fully understand and master the algorithms a strong math background is necessary. However, not all fields of math are required for all of them. The specific fields of math that are required are discussed below:
    Linear Algebra: Since all of these fields are based on data, which comes in huge volumes of rows and columns, matrices are the easiest and most convenient way of representing and manipulating such data. Hence, a thorough knowledge of Linear Algebra and matrix operations is necessary for all three fields.
    Calculus: Deep Learning, the sub-field of Machine Learning, is heavily dependent on calculus, to be more precise on multivariate derivatives. In neural networks, backpropagation requires multiple derivative calculations, which demands a thorough knowledge of calculus.
    Statistics: Since these fields deal with huge amounts of data, knowledge of statistics is imperative. Statistical methods for selecting and testing smaller, diverse samples are a common application across all three fields. However, statistics finds its main application in Data Science, where many algorithms are purely statistical (e.g. the ARIMA algorithm used for Time Series Analysis).
    Probability: For similar reasons, probability, and in particular conditional probability, is the basic building block of important Machine Learning algorithms like the Naive Bayes Classifier. Probability theory is also very important for understanding Data Science algorithms.
  • Computer Science: There is no doubt that each of these fields is part of Computer Science. Hence, a thorough knowledge of computer science algorithms is quite necessary.
    Search and Optimization Algorithms: Fundamental search algorithms like Breadth-First Search (BFS), Depth-First Search (DFS), Bidirectional Search, route optimization algorithms, etc. are quite important. These search and optimization algorithms find their use in the Artificial Intelligence field.
    Fuzzy Logic: Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. It imitates the way human beings make decisions, for example making a YES or NO decision based on a certain set of events or environmental conditions. Fuzzy Logic is primarily used in Artificially Intelligent Systems.
    Basic Algorithms and Optimization: Even though this is not a strict necessity, it is good to have, since fundamental knowledge of algorithms (searching, sorting, recursion, etc.) and of optimization (space and time complexity) is useful in any computer science related field.
  • Programming Knowledge: The implementation of any of the algorithms in these fields is done through programming. Hence a thorough knowledge of programming is a necessity. Some of the most commonly used programming languages are discussed further.
    Python: One of the most commonly used programming languages for any of these fields is Python. It is used across the industry and has support for a plethora of open source libraries for Machine Learning, Deep Learning, Artificial Intelligence, and Data Science. However, programming is not just about writing code, it is about writing proper Pythonic code. This has been discussed in detail in this article: A Guide to Best Python Practices.
    R: This is the second most used programming language for such applications across the industry. R excels in statistical libraries and data visualization when compared to Python, but lacks significantly when it comes to Deep Learning libraries. Hence, R is a preferred tool for Data Scientists.

Job Market

Jobs in each of these fields are in very high demand. As Andrew Ng put it, "AI is the new Electricity". This rings true, as the extended field of Artificial Intelligence is on the verge of revolutionizing every industry in ways that could not be anticipated earlier.

Hence, the demand for jobs in Data Science and Machine Learning is quite high. There are more job openings worldwide than there are qualified engineers eligible to fill them. Due to this supply-demand gap, the compensation companies offer for such roles exceeds that of most other domains.

The job scenario for each of the domains is discussed further:

  1. Data Science: The number of job postings with the profile of Data Scientist is the highest among the three domains discussed. Data Scientists are handsomely paid for their work. Due to the blurred lines between the fields, the job description of a Data Scientist ranges from Time Series Forecasting to Computer Vision; it basically covers the entire domain. For further insights on the job aspect of Data Science, the article on What is Data Science can be referred to.
  2. Machine Learning: Even though the number of job postings with the profile "Machine Learning Engineer" is much smaller than that of Data Scientist, it is still a significant field to consider in terms of job availability. Moreover, someone who is skilled in Machine Learning is a good candidate for a Data Science role. However, unlike Data Science, Machine Learning job descriptions primarily deal with "learning" algorithms (including Deep Learning), and the industry ranges from Natural Language Processing to developing Recommendation Engines.
  3. Artificial Intelligence: Coming across job postings with the profile "Artificial Intelligence Developer" is quite rare. Instead of "Artificial Intelligence", most companies write "Data Scientist" or "Machine/Deep Learning Engineer" in the job profile. However, Artificial Intelligence developers, in addition to getting jobs in the Machine Learning domain, mostly find jobs in Robotics and AI R&D oriented companies like Boston Dynamics, DeepMind, OpenAI, etc.

Conclusion

Data Science, Machine Learning and Artificial Intelligence are like the different branches of the same tree. They are highly overlapping and there is no clear boundary amongst them. They have common skill set requirements and common applications as well. They are just different names given to slightly different versions of AI.

Finally, it is worth mentioning that since there is a high overlap in the required skill-set, an appropriately skilled engineer can work in any of the three domains and switch between them without any major changes.

Artificial Intelligence vs. Machine Learning vs. Deep Learning


Learn the difference between the most popular buzzwords in today's tech world: AI, Machine Learning and Deep Learning

In this article, we are going to discuss the difference between Artificial Intelligence, Machine Learning, and Deep Learning.

Furthermore, we will address the question of why Deep Learning as a young emerging field is far superior to traditional Machine Learning.

Artificial Intelligence, Machine Learning, and Deep Learning are popular buzzwords that everyone seems to use nowadays.

But still, there is a big misconception among many people about the meaning of these terms.

In the worst case, one may think that these terms describe the same thing — which is simply false.

A large number of companies nowadays claim to incorporate some kind of "Artificial Intelligence" (AI) in their applications or services.

But Artificial Intelligence is only a broad term that describes applications where a machine mimics "cognitive" functions that humans associate with other human minds, such as "learning" and "problem-solving".

On a lower level, an AI can be just a programmed rule that makes the machine behave in a certain way in certain situations. So basically, Artificial Intelligence can be nothing more than a bunch of if-else statements.

An if-else statement is a simple rule explicitly programmed by a human. Consider a very abstract, simple example of a robot that is moving along a road. A possible programmed rule for that robot could look as follows:
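Here is a minimal hypothetical sketch of such a rule in Python; the sensor value, the distance thresholds and the actions are invented purely for illustration.

```python
def decide_action(distance_to_obstacle_m: float) -> str:
    """Explicitly programmed behavior: every branch was written by a human."""
    if distance_to_obstacle_m < 1.0:
        return "stop"            # obstacle too close: halt immediately
    elif distance_to_obstacle_m < 5.0:
        return "slow_down"       # obstacle ahead: reduce speed
    else:
        return "keep_driving"    # road is clear: continue

print(decide_action(0.5))   # -> "stop"
print(decide_action(12.0))  # -> "keep_driving"
```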

Instead, when speaking of Artificial Intelligence, it is really only worthwhile to consider two different approaches: Machine Learning and Deep Learning. Both are subfields of Artificial Intelligence.

Machine Learning vs Deep Learning

Now that we better understand what Artificial Intelligence means, we can take a closer look at Machine Learning and Deep Learning and make a clearer distinction between the two.

Machine Learning incorporates "classical" algorithms for various kinds of tasks such as clustering, regression or classification. Machine Learning algorithms must be trained on data. The more data you provide to your algorithm, the better it gets.

The “training” part of a Machine Learning model means that this model tries to optimize along a certain dimension. In other words, the Machine Learning models try to minimize the error between their predictions and the actual ground truth values.

For this, we must define a so-called error function, also called a loss function or an objective function, because after all the model has an objective. This objective could be, for example, the classification of data into different categories (e.g. cat and dog pictures) or the prediction of the expected price of a stock in the near future.

When someone says they are working with a machine-learning algorithm, you can get to the gist of its value by asking: What’s the objective function?

At this point, you may ask: How do we minimize the error?

One way would be to compare the prediction of the model with the ground truth value and adjust the parameters of the model in a way so that next time, the error between these two values is smaller. This is repeated again and again and again.

Thousands and millions of times, until the parameters of the model that determine the predictions are so good that the difference between the predictions of the model and the ground truth labels is as small as possible.

In short, machine learning models are optimization algorithms. If you tune them right, they minimize their error by guessing and guessing and guessing again.
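As a minimal sketch of this guess-and-adjust loop, here is plain NumPy gradient descent on a mean-squared-error objective (the toy data, the learning rate and the iteration count are illustrative assumptions, not values from the text):

```python
import numpy as np

# Toy data: the ground-truth relationship is roughly y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

w, b = 0.0, 0.0            # initial guesses for the model's parameters
learning_rate = 0.01

for step in range(5000):
    predictions = w * x + b
    error = predictions - y                 # how wrong the current guess is
    loss = np.mean(error ** 2)              # the objective we minimize (MSE)
    # Nudge the parameters in the direction that reduces the loss.
    w -= learning_rate * np.mean(2 * error * x)
    b -= learning_rate * np.mean(2 * error)

print(f"learned w={w:.2f}, b={b:.2f}, final loss={loss:.4f}")
```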

Machine Learning is old…

Machine Learning is a fairly old field that incorporates methods and algorithms which have been around for decades, some of them since as early as the sixties.

Some well-known methods of classification and prediction are the Naive Bayes Classifier and Support Vector Machines. In addition to classification, there are also clustering algorithms such as the well-known K-Means and tree-based clustering. To reduce the dimensionality of data and gain more insight into its nature, methods such as Principal Component Analysis and t-SNE are used.

Deep Learning — The next big Thing

Deep Learning, on the other hand, is a very young field of Artificial Intelligence that is powered by artificial neural networks.

It can be viewed as yet another subfield of Machine Learning, since Deep Learning algorithms also require data in order to learn to solve tasks. However, although Deep Learning methods can perform the same tasks as classic Machine Learning algorithms, the reverse is not true.

Artificial neural networks have unique capabilities that enable Deep Learning models to solve tasks that Machine Learning models could never solve.

Many recent advances in machine intelligence are due to Deep Learning. Without it we would not have self-driving cars, chatbots or personal assistants like Alexa and Siri. The Google Translate app would remain primitive, and Netflix would have no idea which movies or TV series we like or dislike.

We can even go so far as to say that the new industrial revolution is driven by artificial neural networks and Deep Learning. This is the best and closest approach to true machine intelligence we have so far. The reason is that Deep Learning has two major advantages over Machine Learning.

Why is Deep Learning better than Machine Learning?

The first advantage is that it removes the need for feature extraction. What do I mean by this?

Well, if you want to use a Machine Learning model to determine whether a given picture shows a car or not, we as humans must first program the unique features of a car (shape, size, windows, wheels, etc.) into the algorithm. This way the algorithm knows what to look for in the given pictures.

In the case of a Deep Learning model, this step is completely unnecessary. The model recognizes all the unique characteristics of a car by itself and makes correct predictions.

In fact, removing feature extraction applies to every other task a deep learning model performs. You simply give the neural network the raw data and the rest is done by the model, whereas for a machine learning model you would need to perform additional steps, such as the already mentioned extraction of features from the given data.
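A minimal sketch of this idea, assuming Python with Keras (a framework choice of mine, not named in the text): the network receives raw pixels and learns its own features through the convolutional layers instead of being given hand-coded descriptions of wheels or windows. The image size and layer sizes are illustrative assumptions.

```python
import tensorflow as tf

# A tiny convolutional network for "car vs. not car" on raw 64x64 RGB images.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 3)),           # raw pixels go in
    tf.keras.layers.Conv2D(16, 3, activation="relu"),   # features are learned,
    tf.keras.layers.MaxPooling2D(),                      # not hand-programmed
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),      # probability of "car"
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
# Training would then be model.fit(images, labels, epochs=...), where
# `images` are raw pixel arrays and `labels` mark car / not-car.
```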

The second huge advantage of Deep Learning and a key part in understanding why it’s becoming so popular is that it’s powered by massive amounts of data. The “Big Data Era” of technology will provide huge amounts of opportunities for new innovations in deep learning. To quote Andrew Ng, the chief scientist of China’s major search engine Baidu and one of the leaders of the Google Brain Project:

The analogy to deep learning is that the rocket engine is the deep learning models and the fuel is the huge amounts of data we can feed to these algorithms.

Deep Learning models tend to increase their accuracy as the amount of training data grows, whereas traditional machine learning models such as SVM and the Naive Bayes classifier stop improving after a saturation point.

Data Science vs Data Analytics vs Big Data


When we talk about data processing, Data Science, Big Data and Data Analytics are the terms that come to mind, and there has always been confusion between them. In this article on Data Science vs Big Data vs Data Analytics, we will look at the similarities and differences between them.


We live in a data-driven world. In fact, the amount of digital data that exists is growing at a rapid rate, doubling every two years, and changing the way we live. Now that Hadoop and other frameworks have resolved the problem of storage, the main focus has shifted to processing this huge amount of data. When we talk about data processing, Data Science, Big Data and Data Analytics are the terms that come to mind, and there has always been confusion between them.

In this article on Data Science vs Data Analytics vs Big Data, I will be covering the following topics in order to make you understand the similarities and differences between them.
  • Introduction to Data Science, Big Data & Data Analytics
  • What does a Data Scientist, Big Data Professional & Data Analyst do?
  • Skill-set required to become a Data Scientist, Big Data Professional & Data Analyst
  • What is the Salary Prospect?
  • Real-time Use-case

Introduction to Data Science, Big Data, & Data Analytics

Let’s begin by understanding the terms Data Science vs Big Data vs Data Analytics.

What Is Data Science?

Data Science is a blend of various tools, algorithms, and machine learning principles with the goal to discover hidden patterns from the raw data.


It also involves solving a problem in various ways to arrive at a solution; on the other hand, it involves designing and constructing new processes for data modeling and production using various prototypes, algorithms, predictive models, and custom analyses.

What is Big Data?

Big Data refers to the large volumes of data pouring in from various data sources in different formats. This data can be analyzed for insights that lead to better decisions and strategic business moves.


What is Data Analytics?

Data Analytics is the science of examining raw data with the purpose of drawing conclusions about that information. It is all about discovering useful information from the data to support decision-making. This process involves inspecting, cleansing, transforming & modeling data.


What Does Data Scientist, Big Data Professional & Data Analyst Do?

What does a Data Scientist do?

Data Scientists perform exploratory analysis to discover insights from the data. They also use various advanced machine learning algorithms to predict the occurrence of particular events in the future. This involves identifying hidden patterns, unknown correlations, market trends and other useful business information.

Roles of Data Scientist

What do Big Data Professionals do?

The responsibilities of a big data professional revolve around dealing with huge amounts of heterogeneous data, gathered from various sources and arriving at high velocity.

Roles of a Big Data Professional

Big data professionals describe the structure and behavior of a big data solution and how it can be delivered using big data technologies such as Hadoop, Spark, Kafka, etc., based on the requirements.
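As a hedged sketch of what building on such technologies can look like in code, here is a tiny PySpark aggregation; the file path, column names and metric are hypothetical, not Netflix's actual pipeline.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Start a Spark session (local here; on a cluster in production).
spark = SparkSession.builder.appName("viewing-events").getOrCreate()

# Hypothetical input: a huge log of viewing events stored as JSON files.
events = spark.read.json("/data/viewing-events/*.json")

# Distributed aggregation over data far too large for a single machine.
hours_per_title = (
    events.groupBy("title")
          .agg((F.sum("watch_seconds") / 3600).alias("hours_watched"))
          .orderBy(F.desc("hours_watched"))
)
hours_per_title.show(10)
```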

What does a Data Analyst do?

Data analysts translate numbers into plain English. Every business collects data, like sales figures, market research, logistics, or transportation costs. A data analyst’s job is to take that data and use it to help companies to make better business decisions.

Roles of Data Analyst

Skill-Set Required To Become Data Scientist, Big Data Professional, & Data Analyst

What Is The Salary Prospect?

The figure below shows the average salary structure of **Data Scientist, Big Data Specialist,** and Data Analyst.

A Scenario Illustrating The Use Of Data Science vs Big Data vs Data Analytics.

Now, let's try to understand how we can reap the benefits by combining all three of them.

Let’s take an example of Netflix and see how they join forces in achieving the goal.

First, let's understand the role of a Big Data Professional in the Netflix example.

Netflix generates a huge amount of unstructured data in the form of text, audio, video files and more. If we try to process this dark (unstructured) data using the traditional approach, it becomes a complicated task.

Approach in Netflix

Traditional Data Processing

Hence a Big Data Professional designs and creates an environment using Big Data tools to ease the processing of Netflix Data.

Big Data approach to process Netflix data

Now, let's see how a Data Scientist optimizes the Netflix streaming experience.

Role of Data Scientist in Optimizing the Netflix streaming experience

1. Understanding the impact of QoE on user behavior

User behavior refers to the way a user interacts with the Netflix service, and data scientists use the data to both understand and predict behavior. For example, how would a change to the Netflix product affect the number of hours that members watch? To improve the streaming experience, Data Scientists look at QoE metrics that are likely to have an impact on user behavior. One metric of interest is the rebuffer rate, which is a measure of how often playback is temporarily interrupted. Another metric is bitrate, which refers to the quality of the picture that is served/seen; a very low bitrate corresponds to a fuzzy picture.
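As a small, hedged illustration, here is how such QoE metrics might be computed from a playback log with pandas; the event table, its column names and the exact metric definition are assumptions of mine, not Netflix's actual schema.

```python
import pandas as pd

# Hypothetical playback event log: one row per playback session.
events = pd.DataFrame({
    "session_id":       [1, 2, 3, 4],
    "rebuffer_count":   [0, 3, 1, 0],          # times playback was interrupted
    "play_seconds":     [1800, 2400, 600, 3600],
    "avg_bitrate_kbps": [4300, 1200, 2800, 5000],
})

# Rebuffer rate: interruptions per hour of playback (one possible definition).
events["rebuffers_per_hour"] = (
    events["rebuffer_count"] / (events["play_seconds"] / 3600.0)
)

print(events[["session_id", "rebuffers_per_hour", "avg_bitrate_kbps"]])
print("mean bitrate:", events["avg_bitrate_kbps"].mean(), "kbps")
```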

2. Improving the streaming experience

How do Data Scientists use data to provide the best user experience once a member hits “play” on Netflix?

One approach is to look at the algorithms that run in real-time or near real-time once playback has started, which determine what bitrate should be served, what server to download that content from, etc.

For example, a member with a high-bandwidth connection on a home network could have very different expectations and experience compared to a member with low bandwidth on a mobile device on a cellular network.

By determining all these factors one can improve the streaming experience.

3. Optimize content caching

A set of big data problems also exists on the content delivery side.

The key idea here is to locate the content closer (in terms of network hops) to Netflix members to provide a great experience. By viewing the behavior of the members being served and the experience, one can optimize the decisions around content caching.

4. Improving content quality

Another approach to improving user experience involves looking at the quality of content, i.e. the video, audio, subtitles, closed captions, etc. that are part of the movie or show. Netflix receives content from the studios in the form of digital assets that are then encoded and quality checked before they go live on the content servers.

In addition to the internal quality checks, Data Scientists also receive feedback from members who discover issues while viewing.

By combining member feedback with intrinsic factors related to viewing behavior, they build models that predict whether a particular piece of content has a quality issue. Machine learning models, along with natural language processing (NLP) and text mining techniques, can be used both to improve the quality of content that goes live and to use the information provided by Netflix users to close the loop on quality and replace content that does not meet users' expectations.
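A minimal sketch of that kind of feedback model, assuming Python with scikit-learn (the feedback snippets, their labels and the pipeline choice are purely illustrative assumptions):

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical member feedback, labeled 1 = quality issue, 0 = no issue.
feedback = [
    "audio is out of sync with the video",
    "subtitles are missing in episode two",
    "loved the show, great acting",
    "the picture keeps freezing and looks blurry",
    "amazing soundtrack and story",
    "no complaints, streamed perfectly",
]
labels = [1, 1, 0, 1, 0, 0]

# Text mining + ML: turn raw text into features, then classify.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(feedback, labels)

print(model.predict(["the sound cuts out halfway through"]))    # likely an issue
print(model.predict(["best series I have watched this year"]))  # likely fine
```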

So this is how Data Scientists optimize the Netflix streaming experience.

Now let's understand how Data Analytics is used to drive Netflix's success.

Role of Data Analyst in Netflix

The above figure shows the different types of users who watch the video/play on Netflix. Each of them has their own choices and preferences.

So what does a Data Analyst do?

A Data Analyst creates a user stream based on the preferences of users. For example, if user 1 and user 2 have the same preference or choice of video, then the data analyst creates a user stream for those choices. The analyst also:
  • Orders the Netflix collection for each member profile in a personalized way. We know that the same genre row for each member has an entirely different selection of videos.
  • Picks out the top personalized recommendations from the entire catalog, focusing on the titles that rank highest.
  • Surfaces trending videos by capturing all events and user activities on Netflix.
  • Sorts the recently watched titles and estimates whether the member will continue to watch, rewatch, or stop watching, etc.
I hope you have understood the differences and similarities between Data Science vs Big Data vs Data Analytics.