“Node Is Not Recognized as an Internal or External Command” Error

Node.js is a powerful runtime environment built on Google’s V8 engine. It is used to build fast and scalable applications such as streaming services, chat apps, browser games, command-line tools, and much more. As a programmer, however, you cannot escape errors. One such Node.js error is shown in the screenshot below.

As you can see, we are encountering the “node is not recognized as an internal or external command” error.

#node 


Ryleigh Walker

1593634380

'Node' Is Not Recognized As An Internal Or External Command, Operable Program

See the “‘node’ is not recognized as an internal or external command” message? The Solution is simple! Just click to read how.

#node #recognized #internal #external #command #operable


Hire Dedicated Node.js Developers - Hire Node.js Developers

If you look at the backend technology used by today’s most popular apps, you will find one thing many of them have in common: Node.js. Although it is often called a framework, Node.js is a JavaScript runtime, and it has proven effective and successful for backend work.

If you want a strong backend for efficient app performance, consider Node.js.

WebClues Infotech offers professionals at different levels of experience and expertise for your app development needs. You can hire a dedicated Node.js developer from WebClues Infotech to match your experience and expertise requirements.

So what are you waiting for? Get your app developed with strong performance from WebClues Infotech.

For inquiry click here: https://www.webcluesinfotech.com/hire-nodejs-developer/

Book Free Interview: https://bit.ly/3dDShFg

#hire dedicated node.js developers #hire node.js developers #hire top dedicated node.js developers #hire node.js developers in usa & india #hire node js development company #hire the best node.js developers & programmers


Rylan Becker

1668563924

Machine Learning Tutorial: Step By Step for Beginners

In this article, we walk through Machine Learning step by step for beginners. The tutorial covers both the basics and some intermediate topics, and it is designed for students and working professionals who are complete beginners. By the end, you will be able to build machine learning models that perform tasks such as predicting the price of a house or recognizing the species of an Iris flower from the dimensions of its petals and sepals. If you are not a complete beginner and are already somewhat familiar with Machine Learning, I would suggest starting with subtopic eight, i.e., Types of Machine Learning.

Before we deep dive further, if you are keen to explore a course in Artificial Intelligence & Machine Learning do check out our Artificial Intelligence Courses available at Great Learning. Anyone could expect an average Salary Hike of 48% from this course. Participate in Great Learning’s career accelerate programs and placement drives and get hired by our pool of 500+ Hiring companies through our programs.

Before jumping into the tutorial, you should be familiar with Pandas and NumPy. This is important to understand the implementation part. There are no prerequisites for understanding the theory. Here are the subtopics that we are going to discuss in this tutorial:

What is Machine Learning?

Arthur Samuel coined the term Machine Learning in the year 1959. He was a pioneer in Artificial Intelligence and computer gaming, and defined Machine Learning as “Field of study that gives computers the capability to learn without being explicitly programmed”.

In simple terms, Machine Learning is an application of Artificial Intelligence (AI) that enables a program (software) to learn from experience and improve itself at a task without being explicitly programmed. For example, how would you write a program that can identify fruits based on properties such as colour, shape, or size?

One approach is to hardcode everything: write some rules and use them to identify the fruits. This may seem like the only way, and it may even work, but no one can write perfect rules that apply to all cases. Machine learning tackles this problem without hand-written rules, which makes it more robust and practical. You will see how we use machine learning for this kind of task in the coming sections.

Thus, we can say that Machine Learning is the study of making machines more human-like in their behaviour and decision making by giving them the ability to learn with minimum human intervention, i.e., no explicit programming. Now the question arises: how can a program gain experience, and where does it learn from? The answer is data. Data is often called the fuel for Machine Learning, and we can safely say that there is no machine learning without data.

You may be wondering: if the term Machine Learning was introduced back in 1959, why was there so little mention of it until recent years? Machine Learning needs huge computational power, a lot of data, and devices capable of storing such vast data. Only recently have we reached the point where all these requirements are met and Machine Learning can be practised widely.

How is it different from traditional programming?

How is Machine Learning different from traditional programming? In traditional programming, we feed input data and a well-written, tested program into a machine to generate output. In machine learning, the input data along with the output associated with it is fed into the machine during the learning phase, and the machine works out a program for itself.

Why do we need Machine Learning?

Machine Learning is receiving a great deal of attention today. It can automate many tasks, especially ones that previously only humans could perform with their innate intelligence. Machine learning is currently the main practical way to replicate this kind of intelligence in machines.

With the help of Machine Learning, businesses can automate routine tasks. It also helps to automate and speed up the creation of models for data analysis. Various industries depend on vast quantities of data to optimize their operations and make intelligent decisions. Machine Learning helps create models that can process and analyze large amounts of complex data to deliver accurate results. These models are precise and scalable, and they work with shorter turnaround times. By building such models, businesses can take advantage of profitable opportunities and avoid unknown risks.

Image recognition, text generation, and many other use cases are finding applications in the real world. This is widening the scope for machine learning experts to become sought-after professionals.

How Does Machine Learning Work?

A machine learning model learns from the historical data fed to it and then builds prediction algorithms to predict the output for new data that comes in as input to the system. The accuracy of these models depends on the quality and amount of input data. In general, a large amount of good-quality data helps build a better model that predicts the output more accurately.

Suppose we have a complex problem at hand that requires some predictions. Instead of writing code for it by hand, we can feed the given data to generic machine learning algorithms. With the help of these algorithms, the machine develops its own logic and predicts the output. Machine learning has transformed the way we approach business and social problems. Below is a diagram that briefly explains the working of a machine learning model or algorithm.

History of Machine Learning

Nowadays, we can see some amazing applications of ML such as in self-driving cars, Natural Language Processing and many more. But Machine learning has been here for over 70 years now. It all started in 1943, when neurophysiologist Warren McCulloch and mathematician Walter Pitts wrote a paper about neurons, and how they work. They decided to create a model of this using an electrical circuit, and therefore, the neural network was born.

In 1950, Alan Turing created the “Turing Test” to determine if a computer has real intelligence. To pass the test, a computer must be able to fool a human into believing it is also human. In 1952, Arthur Samuel wrote the first computer learning program. The program was the game of checkers, and the IBM computer improved at the game the more it played, studying which moves made up winning strategies and incorporating those moves into its program.

A few years later, in 1957, Frank Rosenblatt designed the first neural network for computers (the perceptron), which simulates the thought processes of the human brain. Later, in 1967, the “nearest neighbor” algorithm was written, allowing computers to begin using very basic pattern recognition. This could be used to map a route for travelling salesmen, starting at a random city but ensuring they visit all cities during a short tour.

The 1990s brought a big change: work on machine learning shifted from a knowledge-driven approach to a data-driven approach. Scientists began to create programs for computers to analyze large amounts of data and draw conclusions, or “learn”, from the results.

In 1997, IBM’s Deep Blue became the first computer chess-playing system to beat a reigning world chess champion. Deep Blue used the computing power of the 1990s to perform large-scale searches of potential moves and select the best move. About a decade later, in 2006, Geoffrey Hinton popularized the term “deep learning” to describe new algorithms that help computers distinguish objects and text in images and videos.

Machine Learning at Present

The year 2012 saw the publication of an influential research paper by Alex Krizhevsky, Geoffrey Hinton, and Ilya Sutskever, describing a model that dramatically reduced the error rate in image recognition systems. Meanwhile, Google’s X Lab developed a machine learning algorithm capable of autonomously browsing YouTube videos to identify the ones that contain cats. In 2016, AlphaGo (created by researchers at Google DeepMind to play the ancient Chinese game of Go) won four out of five games against Lee Sedol, who had been one of the world’s top Go players for over a decade.

In 2020, OpenAI released GPT-3, which was among the most powerful language models of its time. It can write creative fiction, generate functioning code, compose thoughtful business memos, and much more. Its possible use cases are limited only by our imaginations.

Features of Machine Learning

1. Automation: Your Gmail account has a spam folder that collects spam emails. You might wonder how Gmail knows which emails are spam. This is the work of Machine Learning: it recognizes spam emails, which makes the process easy to automate. The ability to automate repetitive tasks is one of the biggest characteristics of machine learning. A huge number of organizations already use machine learning-powered paperwork and email automation. In the financial sector, for example, many repetitive, data-heavy, and predictable tasks need to be performed, so this sector uses different types of machine learning solutions to a great extent.

2. Improved customer experience: For any business, one of the most crucial ways to drive engagement, promote brand loyalty, and establish long-lasting customer relationships is to provide a customized experience and better service. Machine Learning helps with both. Have you ever noticed that when you open a shopping site or see ads on the internet, they are mostly about something you recently searched for? This is because machine learning has enabled accurate recommendation systems that help customize the user experience. As for service, most companies nowadays have a chatbot that is available 24×7. An example is Eva from AirAsia. These bots provide intelligent answers, and sometimes you might not even notice that you are talking to a bot. They use Machine Learning, which helps them provide a good user experience.

3. Automated data visualization: Companies and individuals generate a huge amount of data. Take companies like Google, Twitter, and Facebook: how much data do they generate per day? We can use this data to visualize notable relationships, giving businesses the ability to make better decisions that benefit both companies and customers. With user-friendly automated data visualization platforms such as AutoViz, businesses can obtain new insights to increase productivity in their processes.

4. Business intelligence: Machine learning, when combined with big data analytics, can help companies find solutions to problems, which in turn helps businesses grow and generate more profit. From retail to financial services to healthcare and more, ML has already become one of the most effective technologies for boosting business operations.

Python provides flexibility in choosing between object-oriented programming or scripting. There is also no need to recompile the code; developers can implement any changes and instantly see the results. You can use Python along with other languages to achieve the desired functionality and results.

Python is a versatile programming language and runs on many platforms, including Windows, macOS, Linux, and Unix. When migrating from one platform to another, the code usually needs only minor adaptations before it is ready to work on the new platform. To build a strong foundation and cover the basic concepts, you can enroll in a Python machine learning course that will help you advance your career.

Here is a summary of the benefits of using Python for Machine Learning problems:

[Figure: summary of the benefits of using Python for Machine Learning]

Types of Machine Learning

Machine learning is broadly divided into three categories:

  1. Supervised Learning
  2. Unsupervised Learning
  3. Reinforcement Learning

What is Supervised Learning?

Let us start with an easy example. Say you are teaching a kid to differentiate dogs from cats. How would you do it?

You may show the kid a dog and say “here is a dog”, and when you encounter a cat you would point it out as a cat. After seeing enough dogs and cats, the kid may learn to differentiate between them. If trained well, the kid may even be able to recognize breeds of dogs never seen before.

Similarly, in Supervised Learning, we have two sets of variables. One is the target variable, or label (the variable we want to predict), and the other is the features (variables that help us predict the target variable). We show the program (model) the features and the label associated with them, and the program then finds the underlying pattern in the data. Take this example dataset, where we want to predict the price of a house given its size, measured here as the number of rooms. The price, which is the target variable, depends on the size, which is the feature.

| Number of rooms | Price |
|-----------------|-------|
| 1               | $100  |
| 3               | $300  |
| 5               | $500  |

In a real dataset, we will have a lot more rows and more than one feature, such as size, location, number of floors, and many more.

Thus, we can say that the supervised learning model has a set of input variables (x), and an output variable (y). An algorithm identifies the mapping function between the input and output variables. The relationship is y = f(x).
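As a minimal sketch of learning the mapping y = f(x), the snippet below fits a straight line to the three rows of the rooms/price table above using ordinary least squares. The function names `fit_line` and `predict`, and the choice of a linear model, are assumptions made for illustration only.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


rooms = [1, 3, 5]          # feature (x), taken from the table above
prices = [100, 300, 500]   # target (y), taken from the table above

slope, intercept = fit_line(rooms, prices)


def predict(x):
    """Apply the learned mapping f(x) = slope * x + intercept."""
    return slope * x + intercept


print(predict(4))  # price predicted for a house with 4 rooms
```

The learned pair (slope, intercept) plays the role of f: once it has been estimated from labelled rows, it can be applied to inputs that were not in the training data.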

The learning is monitored or supervised in the sense that we already know the output, and the algorithm is corrected each time to optimize its results. The algorithm is trained over the data set and amended until it achieves an acceptable level of performance.

We can group the supervised learning problems as:

Regression problems – Used to predict continuous values, with the model trained on historical data. E.g., predicting the future price of a house.

Classification problems – The algorithm is trained on labelled examples to assign items to a specific category. E.g., dog or cat (as in the example above), apple or orange, beer or wine or water.

What is Unsupervised Learning?

This approach is the one where we have no target variable and only the input variables (features) at hand. The algorithm learns by itself and discovers interesting structure in the data.

The goal is to decipher the underlying distribution in the data to gain more knowledge about the data. 

We can group the unsupervised learning problems as:

Clustering: This means grouping data points with similar characteristics together. E.g., grouping users based on search history.

Association: Here, we discover the rules that govern meaningful associations among the data set. E.g., People who watch ‘X’ will also watch ‘Y’.
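To make clustering concrete, here is a small sketch of one-dimensional k-means, which is only one of many clustering algorithms and is swapped in here purely for illustration. The data (a hypothetical number of searches per day for six users), the starting centroids, and the function name `kmeans_1d` are all assumptions.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Cluster 1-D points around the given starting centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical searches per day for six users (no labels are given).
searches_per_day = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(searches_per_day, centroids=[1, 12])
print(centroids, clusters)
```

No target variable is supplied anywhere: the two groups (light and heavy searchers) emerge only from the structure of the inputs.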

What is Reinforcement Learning?

In this approach, machine learning models are trained to make a series of decisions based on the rewards and feedback they receive for their actions. The machine learns to achieve a goal in complex and uncertain situations and is rewarded each time it achieves it during the learning period. 

Reinforcement learning is different from supervised learning in the sense that no labelled answer is available, so the reinforcement agent decides which steps to take to perform a task. The machine learns from its own experience rather than from a training data set.

In this tutorial, we are going to focus mainly on Supervised Learning and Unsupervised Learning, as these are relatively easy to understand and implement.

Machine learning Algorithms

This may be the most time-consuming and difficult process in your journey of Machine Learning. There are many algorithms in Machine Learning and you don’t need to know them all in order to get started. But I would suggest, once you start practising Machine Learning, start learning about the most popular algorithms out there such as:

Here, I am going to give a brief overview of one of the simplest algorithms in Machine Learning, the K-nearest neighbor (KNN) algorithm, which is a Supervised Learning algorithm, and show how we can use it for regression as well as for classification. I would highly recommend reading about Linear Regression and Logistic Regression, as we are going to implement them and compare the results with KNN in the implementation part.

Note that there are usually separate algorithms for regression problems and classification problems. But by modifying an algorithm, we can use it for both classification and regression, as you will see below.

K-Nearest Neighbor Algorithm

KNN belongs to a group of lazy learners. As opposed to eager learners such as logistic regression, SVMs, and neural nets, lazy learners just store the training data in memory. During the training phase, KNN arranges the data (a sort of indexing process) so that it can find the closest neighbours efficiently during the inference phase. Otherwise, it would have to compare each new case with the whole dataset during inference, which is quite inefficient.

If you are wondering what the training phase, eager learners, and lazy learners are, for now just remember that the training phase is when an algorithm learns from the data provided to it. For example, if you have gone through the Linear Regression algorithm mentioned above, during the training phase it tries to find the best-fit line, a process that involves a lot of computation and hence takes time; algorithms of this type are called eager learners. Lazy learners such as KNN, on the other hand, do very little computation during training and hence train faster, deferring most of the work to prediction time.

K-NN for Classification Problem

Now let us see how we can use K-NN for classification. Here is a hypothetical dataset used to predict whether a person is male or female (label) on the basis of height and weight (features).

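| Height (cm) - feature | Weight (kg) - feature | Gender (label) |
|-----------------------|-----------------------|----------------|
| 187                   | 80                    | Male           |
| 165                   | 50                    | Female         |
| 199                   | 99                    | Male           |
| 145                   | 70                    | Female         |
| 180                   | 87                    | Male           |
| 178                   | 65                    | Female         |
| 187                   | 60                    | Male           |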

Now let us plot these points:

[Figure: K-NN algorithm, scatter plot of the dataset]

Now we have a new point that we want to classify, given that its height is 190 cm and its weight is 100 kg. Here is how K-NN will classify this point:

  1. Select the value of K. The user chooses the value that seems best after analysing the data.
  2. Measure the distance from the new point to the points in the dataset and find its K nearest points. There are various methods for calculating this distance; the most commonly known are Euclidean and Manhattan distance (for continuous features) and Hamming distance (for categorical features).
  3. Identify the classes of the K points closest to the new point and label the new point accordingly. So if the majority of the points closest to our new point belong to a certain class “a”, then our new point is predicted to be from class “a”.
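The three distance measures named in step 2 can be sketched as follows. The function names are illustrative, and the categorical example passed to `hamming` (colour and shape of a fruit) is a hypothetical input, not data from this tutorial.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two numeric points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences along each axis."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming(a, b):
    """Number of positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


new_point = (190, 100)   # height in cm, weight in kg
neighbour = (199, 99)

print(euclidean(new_point, neighbour))                 # sqrt(9**2 + 1**2)
print(manhattan(new_point, neighbour))                 # 9 + 1
print(hamming(("red", "round"), ("red", "long")))      # one attribute differs
```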

Now let us apply this algorithm to our own dataset. Let us first plot the new data point.

[Figure: K-NN algorithm, dataset with the new data point]

Now let us take k=3 i.e, we will see the three closest points to the new point:

[Figure: K-NN algorithm, three nearest neighbours of the new point]

Therefore, it is classified as Male:

[Figure: K-NN algorithm, new point classified as Male]

Now let us take the value of k=5 and see what happens:

[Figure: K-NN algorithm, five nearest neighbours of the new point]

As we can see, four of the points closest to our new data point are male and just one is female, so we go with the majority and classify it as Male again. When doing binary classification, it is a good idea to select an odd value of K so that the vote cannot end in a tie.
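The walkthrough above can be reproduced with a short sketch. It assumes Euclidean distance on the raw, unscaled height and weight values, and the names `DATA` and `knn_classify` are illustrative rather than part of any library.

```python
import math
from collections import Counter

# (height_cm, weight_kg, gender) rows from the table above
DATA = [
    (187, 80, "Male"),
    (165, 50, "Female"),
    (199, 99, "Male"),
    (145, 70, "Female"),
    (180, 87, "Male"),
    (178, 65, "Female"),
    (187, 60, "Male"),
]


def knn_classify(point, data, k):
    """Predict a label by majority vote among the k nearest rows."""
    by_distance = sorted(
        data,
        key=lambda row: math.hypot(row[0] - point[0], row[1] - point[1]),
    )
    votes = Counter(label for _, _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


new_point = (190, 100)  # height 190 cm, weight 100 kg
for k in (3, 5):
    print(k, knn_classify(new_point, DATA, k))
```

In practice the features would usually be scaled first, since otherwise the feature with the larger numeric range dominates the distance; that step is left out here to match the hand calculation.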

K-NN for a Regression problem

We have seen how we can use K-NN for classification. Now, let us see what changes are needed to use it for regression. The algorithm is almost the same, with just one difference. In classification, we took the majority class among the nearest points. Here, we take the average of the nearest points' values and use that as the predicted value. Let us take the same example again, but this time we have to predict the weight (label) of a person given their height (feature).

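| Height (cm) - feature | Weight (kg) - label |
|-----------------------|---------------------|
| 187                   | 80                  |
| 165                   | 50                  |
| 199                   | 99                  |
| 145                   | 70                  |
| 180                   | 87                  |
| 178                   | 65                  |
| 187                   | 60                  |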

Now we have a new data point with a height of 160 cm. We will predict its weight by taking the values of K as 1, 2, and 4.

When K=1: The closest point to 160cm in our data is 165cm which has a weight of 50, so we conclude that the predicted weight is 50 itself.

When K=2: The two closest points are 165 and 145 which have weights equal to 50 and 70 respectively. Taking average we say that the predicted weight is (50+70)/2=60.

When K=4: Repeating the same process with the 4 closest points (165, 145, 178, and 180 cm, with weights 50, 70, 65, and 87), we get a predicted weight of (50+70+65+87)/4=68.
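The same averaging can be checked with a few lines of code. This is a sketch that assumes distance is simply the absolute difference in height; the names `DATA` and `knn_regress` are illustrative.

```python
# (height_cm, weight_kg) rows from the table above
DATA = [(187, 80), (165, 50), (199, 99), (145, 70), (180, 87), (178, 65), (187, 60)]


def knn_regress(height, data, k):
    """Predict weight as the mean weight of the k rows closest in height."""
    nearest = sorted(data, key=lambda row: abs(row[0] - height))[:k]
    return sum(weight for _, weight in nearest) / k


for k in (1, 2, 4):
    print(k, knn_regress(160, DATA, k))
```

Running the loop prints the prediction for each K, so the hand calculations can be compared against it directly.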

You might be thinking that this is really simple and that there is nothing special about Machine Learning; it is just basic mathematics. But remember, this is one of the simplest algorithms, and you will see much more complex ones as you move ahead in this journey.

At this stage, you should have a rough idea of how machine learning works; don’t worry if you are still confused. If you want to go a bit deeper now, here is an excellent article, Gradient Descent in Machine Learning, which discusses how an optimization technique called gradient descent is used to find the best-fit line in linear regression.

How To Choose Machine Learning Algorithm?

There are plenty of machine learning algorithms and it could be a tough task to decide which algorithm to choose for a specific application. The choice of the algorithm will depend on the objective of the problem you are trying to solve.

Let us take the example of a task to predict the type of fruit among three varieties, i.e., apple, banana, and orange. The predictions are based on the colour of the fruit. The picture depicts the results of ten different algorithms. The picture on the top left is the dataset. The data is classified into three categories: red, light blue, and dark blue. There are some groupings. For instance, in the second image, everything in the upper left belongs to the red category, the middle part is a mixture of uncertainty and light blue, and the bottom corresponds to the dark blue category. The other images show different algorithms and how they try to classify the data.

Steps in Machine Learning

I wish Machine Learning were just a matter of applying algorithms to your data and getting the predicted values, but it is not that simple. There are several steps in Machine Learning that are a must for each project.

  1. Gathering Data: This is perhaps the most important and time-consuming step. Here we need to collect data that can help us solve our problem. For example, if we want to predict house prices, we need an appropriate dataset that contains information about past house sales, organized in a tabular structure. We are going to solve a similar problem in the implementation part.
  2. Preparing the data: Once we have the data, we need to bring it into a proper format and preprocess it. Pre-processing involves various steps, such as data cleaning. For example, if your dataset has some empty or abnormal values (e.g., a string instead of a number), how are you going to deal with them? There are various ways, but one simple way is to drop the rows that have empty values. Sometimes the dataset also has columns that have no impact on our results, such as IDs; we remove those columns as well. We usually use data visualization to view our data through graphs and diagrams, and after analyzing the graphs we decide which features are important. Data preprocessing is a vast topic, and I would suggest checking out this article to know more about it.
  3. Choosing a model: Now our data is ready to be fed into a Machine Learning algorithm. In case you are wondering what a model is: “machine learning algorithm” is often used interchangeably with “machine learning model”, but strictly speaking a model is the output of a machine learning algorithm run on data. In simple terms, when we run the algorithm on our data, we get an output that contains all the rules, numbers, and other algorithm-specific data structures required to make predictions. For example, after running Linear Regression on our data we get the equation of the best-fit line, and this equation is the model. If we do not want to tune hyperparameters and are happy with the defaults, the next step is simply training the model.
  4. Hyperparameter Tuning: Hyperparameters are crucial as they control the overall behavior of a machine learning model. The ultimate goal is to find a combination of hyperparameters that gives us the best results. But what are these hyperparameters? Remember the variable K in our K-NN algorithm: we got different results when we set different values of K. The best value for K is not predefined and differs from dataset to dataset. There is no formula that gives the best value for K, but you can try different values and check which one gives the best results. Here K is a hyperparameter. Each algorithm has its own hyperparameters, and we need to tune their values to get the best results. To get more information about it, check out this article – Hyperparameter Tuning Explained.
  5. Evaluation: You may be wondering how you can know whether the model is performing well or badly. What better way than testing the model on some data? This data is known as testing data, and it must not be a subset of the data (training data) on which we trained the algorithm. The objective of training the model is not for it to memorize all the values in the training dataset but to identify the underlying pattern in the data and, based on that, make predictions on data it has never seen before. There are various evaluation methods, such as K-fold cross-validation and many more. We are going to discuss this step in detail in the coming section.
  6. Prediction: Now that our model has performed well on the testing set as well, we can use it in the real world and hope it performs well on real-world data.
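Steps 4 and 5 can be sketched together on the small height/weight/gender dataset from the K-NN section. Because that dataset has only seven rows, this sketch swaps in leave-one-out evaluation (each row is held out once and predicted from the remaining six) instead of a single train/test split; that choice, and the names `knn_classify` and `leave_one_out_accuracy`, are assumptions made for illustration.

```python
import math
from collections import Counter

# (height_cm, weight_kg, gender) rows from the K-NN classification example
DATA = [
    (187, 80, "Male"), (165, 50, "Female"), (199, 99, "Male"),
    (145, 70, "Female"), (180, 87, "Male"), (178, 65, "Female"),
    (187, 60, "Male"),
]


def knn_classify(point, data, k):
    """Predict a label by majority vote among the k nearest rows."""
    by_distance = sorted(
        data,
        key=lambda row: math.hypot(row[0] - point[0], row[1] - point[1]),
    )
    votes = Counter(label for _, _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Fraction of rows predicted correctly when each row is held out in turn."""
    correct = 0
    for i, (height, weight, label) in enumerate(data):
        training = data[:i] + data[i + 1:]  # everything except the held-out row
        if knn_classify((height, weight), training, k) == label:
            correct += 1
    return correct / len(data)


# Hyperparameter tuning: try several values of K and keep the best scorer.
scores = {k: leave_one_out_accuracy(DATA, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The key point is that the row being predicted is never part of the data used to predict it, which mirrors the rule that testing data must not overlap with training data.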


Evaluation of Machine learning Model

For evaluating the model, we hold out a portion of data called test data and do not use this data to train the model. Later, we use test data to evaluate various metrics.

The results of predictive models can be viewed in various forms such as by using confusion matrix, root-mean-squared error(RMSE), AUC-ROC etc.

A confusion matrix is built from four counts. TP (True Positive) is the number of values predicted to be positive by the algorithm that were actually positive in the dataset. TN (True Negative) is the number of values predicted not to belong to the positive class that actually do not belong to it. FP (False Positive) is the number of instances misclassified as belonging to the positive class that are actually part of the negative class. FN (False Negative) is the number of instances classified as the negative class that actually belong to the positive class.
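The four counts can be computed directly from paired lists of actual and predicted labels. The sketch below uses a hypothetical spam/ham example with "spam" as the positive class; the labels and the function name `confusion_counts` are assumptions for illustration.

```python
def confusion_counts(actual, predicted, positive):
    """Count TP, TN, FP, and FN for one positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1   # predicted positive, actually positive
        elif p != positive and a != positive:
            tn += 1   # predicted negative, actually negative
        elif p == positive:
            fp += 1   # predicted positive, actually negative
        else:
            fn += 1   # predicted negative, actually positive
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels for six emails.
actual = ["spam", "spam", "ham", "ham", "spam", "ham"]
predicted = ["spam", "ham", "ham", "spam", "spam", "ham"]

counts = confusion_counts(actual, predicted, positive="spam")
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
print(counts, accuracy)
```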

In regression problems, we usually use RMSE as the evaluation metric. This evaluation technique is based on the error term.

Let’s say you feed a model some input X and the model predicts 10, but the actual value is 5. The difference between your prediction (10) and the actual observation (5) is the error term: (prediction – actual). The formula to calculate RMSE is given by:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{prediction}_i - \mathrm{actual}_i\right)^2}$$

where N is the total number of samples over which we are calculating RMSE.

In a good model, the RMSE should be as low as possible and there should not be much difference between RMSE calculated over training data and RMSE calculated over the testing set. 
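A minimal sketch of the RMSE calculation, using the standard definition (square the error terms, average them, take the square root). The function name `rmse` and the second pair of sample values are illustrative assumptions.

```python
import math


def rmse(predictions, actuals):
    """Root-mean-squared error between predicted and actual values."""
    n = len(predictions)
    squared_errors = [(p - a) ** 2 for p, a in zip(predictions, actuals)]
    return math.sqrt(sum(squared_errors) / n)


# The single sample from the text: prediction 10, actual value 5.
print(rmse([10], [5]))

# Two samples, one of them predicted exactly.
print(rmse([10, 20], [5, 20]))
```

Computing this once on the training data and once on the testing data gives the two numbers whose gap is worth watching.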

Python for Machine Learning

Although many languages can be used for machine learning, in my opinion Python is hands down the best programming language for Machine Learning applications. This is due to the various benefits mentioned in the section below. Other programming languages that can be used for Machine Learning applications include R, C++, JavaScript, Java, C#, Julia, Shell, TypeScript, and Scala. R is also a really good language to get started with machine learning.

Python is famous for its readability and relatively low complexity compared to other programming languages. Machine Learning applications involve complex concepts like calculus and linear algebra, which take a lot of effort and time to implement. Python reduces this burden by letting the Machine Learning engineer implement and validate an idea quickly. You can check out the Python Tutorial to get a basic understanding of the language. Another benefit of using Python in Machine Learning is its pre-built libraries. There are different packages for different types of applications, as listed below:

  1. NumPy, OpenCV, and Scikit are used when working with images
  2. NLTK along with NumPy and Scikit again when working with text
  3. Librosa for audio applications
  4. Matplotlib, Seaborn, and Scikit for data representation
  5. TensorFlow and PyTorch for Deep Learning applications
  6. SciPy for Scientific Computing
  7. Django for integrating web applications
  8. Pandas for high-level data structures and analysis

Implementation of algorithms in Machine Learning with Python

Before moving on to the implementation of machine learning with Python, you need to download some important software and libraries. Anaconda is an open-source distribution that makes it easy to perform Python/R data science and machine learning on a single machine. It contains almost all the libraries that we need. In this tutorial, we are mostly going to use scikit-learn, a free machine learning library for the Python programming language.

Now, we are going to implement everything we have learnt so far. We will solve a Regression problem and then a Classification problem using the seven steps mentioned above.

Implementation of a Regression problem

Our problem is to predict house prices given features such as size, number of rooms, and many more. So let us get started:

  1. Gathering data: We don’t need to manually collect the data for past sales of houses. Luckily there are some good people who do it for us and make these datasets available for us to use. Also let me mention not all datasets are free but for you to practice, you will find most of the datasets free to use on the internet.

The dataset we are using is called the Boston Housing dataset. Each record in the database describes a Boston suburb or town. The data was drawn from the Boston Standard Metropolitan Statistical Area (SMSA) in 1970. The attributes are defined as follows (taken from the UCI Machine Learning Repository).

  1. CRIM: per capita crime rate by town
  2. ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
  3. INDUS: proportion of non-retail business acres per town
  4. CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
  5. NOX: nitric oxides concentration (parts per 10 million)
  6. RM: average number of rooms per dwelling
  7. AGE: the proportion of owner-occupied units built prior to 1940
  8. DIS: weighted distances to five Boston employment centers
  9. RAD: index of accessibility to radial highways
  10. TAX: full-value property-tax rate per $10,000
  11. PTRATIO: pupil-teacher ratio by town 
  12. B: 1000(Bk − 0.63)^2 where Bk is the proportion of blacks by town 
  13. LSTAT: % lower status of the population
  14. MEDV: Median value of owner-occupied homes in $1000s

Here is a link to download this dataset.

After opening the file you can see the data about house sales. The dataset is not in proper tabular form: there are no column names, and the values are separated by spaces. We are going to use Pandas to put it into tabular form. We will provide a list of column names and use ‘\s+’ as the delimiter, a regular expression that matches one or more whitespace characters, so each run of spaces is treated as a single separator.

We are going to import the necessary libraries, Pandas and NumPy. Next, we will read the data file (housing.csv) into a pandas DataFrame.

import numpy as np
import pandas as pd
column_names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX','PTRATIO', 'B', 'LSTAT', 'MEDV']
bos1 = pd.read_csv('housing.csv', delimiter=r"\s+", names=column_names)

2. Preprocess Data: The next step is to pre-process the data. For this dataset, we can see that there are no NaN (missing) values and that all the data is numeric rather than strings, so we won’t face any errors when training the model. So let us just divide our data into training data and testing data such that 70% of the data is training data and the rest is testing data. We could also scale our data to make the predictions more accurate, but for now let us keep it simple.

bos1.isna().sum()

from sklearn.model_selection import train_test_split
X=np.array(bos1.iloc[:,0:13])
Y=np.array(bos1["MEDV"])
#testing data size is of 30% of entire data
x_train, x_test, y_train, y_test =train_test_split(X,Y, test_size = 0.30, random_state =5)
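
The text mentions scaling as an optional step. One common way to do it (an assumption here, since the tutorial itself does not scale the data) is scikit-learn's StandardScaler, fitted on the training split only so that nothing from the test split leaks into preprocessing. The sketch below uses random stand-in data rather than the housing file, and all variable names are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Stand-in data shaped like the tutorial's inputs: 13 features and one target.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=50.0, scale=10.0, size=(100, 13))
y_demo = rng.normal(size=100)

xd_train, xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.30, random_state=5
)

scaler = StandardScaler()
# Learn the per-feature mean and standard deviation from the training split only.
xd_train_scaled = scaler.fit_transform(xd_train)
# Reuse those training statistics for the test split.
xd_test_scaled = scaler.transform(xd_test)

print(xd_train_scaled.mean(axis=0).round(6))
print(xd_train_scaled.std(axis=0).round(6))
```

Distance-based models such as K-NN tend to be more sensitive to feature scale than Linear Regression, so this step matters most for the former.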

3. Choose a Model: For this particular problem, we are going to use two supervised learning algorithms that can solve regression problems and later compare their results. One algorithm is K-NN (K-Nearest Neighbors), which is explained above, and the other is Linear Regression. I would highly recommend checking it out in case you haven’t already.

from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
#load our first model 
lr = LinearRegression()
#train the model on training data
lr.fit(x_train,y_train)
#predict the testing data so that we can later evaluate the model
pred_lr = lr.predict(x_test)
#load the second model
Nn=KNeighborsRegressor(3)
Nn.fit(x_train,y_train)
pred_Nn = Nn.predict(x_test)

4. Hyperparameter Tuning: Since this is a beginner's tutorial, I am only going to tune the value of K in the K-NN model. I will just use a for loop and check the results for K ranging from 1 to 49. K-NN is extremely fast on a small dataset like ours, so it won’t take long. There are much more advanced methods of doing this, which you can find linked in the steps of Machine Learning section above.

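import sklearn.metrics
for i in range(1,50):
    model=KNeighborsRegressor(i)
    model.fit(x_train,y_train)
    pred_y = model.predict(x_test)
    #square root of the MSE gives the RMSE (same result as squared=False, which newer scikit-learn releases no longer accept)
    rmse = np.sqrt(sklearn.metrics.mean_squared_error(y_test, pred_y))
    print("{} error for k = {}".format(rmse,i))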

Output: [figure showing the error printed for each value of k]

From the output, we can see that the error is lowest for k=3, which is why I set K=3 while training the model.
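
The loop above picks K by looking at the error on the test split, which means the test data has influenced the model choice. A common alternative (not the method used in this tutorial) is cross-validation on the training data, for example with scikit-learn's GridSearchCV. The sketch below runs on synthetic data from make_regression so that it is self-contained; the variable names are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the housing data: 13 features, noisy linear target.
X_syn, y_syn = make_regression(
    n_samples=200, n_features=13, noise=10.0, random_state=5
)
xs_train, xs_test, ys_train, ys_test = train_test_split(
    X_syn, y_syn, test_size=0.30, random_state=5
)

# Try the same candidate values of K as the loop, scored by 5-fold
# cross-validated RMSE computed on the training split only.
search = GridSearchCV(
    KNeighborsRegressor(),
    param_grid={"n_neighbors": list(range(1, 50))},
    scoring="neg_root_mean_squared_error",
    cv=5,
)
search.fit(xs_train, ys_train)

best_k = search.best_params_["n_neighbors"]
print("best k:", best_k)
# The scorer returns negated RMSE, so flip the sign for reporting.
print("cross-validated RMSE:", -search.best_score_)
```

The test split stays untouched until the final evaluation, so the reported test error is a fairer estimate of how the chosen K generalizes.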

5. Evaluating the model: For evaluating the model we are going to use the mean_squared_error() function from the scikit-learn library and take the square root of its result to get the RMSE. Older scikit-learn releases also accept squared=False for the same purpose, but the parameter was deprecated in version 1.4 and later removed.

import sklearn.metrics
#RMSE for linear regression
rmse_lr = np.sqrt(sklearn.metrics.mean_squared_error(y_test, pred_lr))
print("error for Linear Regression = {}".format(rmse_lr))
#RMSE for K-NN
rmse_Nn = np.sqrt(sklearn.metrics.mean_squared_error(y_test, pred_Nn))
print("error for K-NN = {}".format(rmse_Nn))

From the results, we can conclude that Linear Regression performs better than K-NN for this particular dataset. But Linear Regression will not always perform better than K-NN; it depends entirely on the data we are working with.

6. Prediction: Now we can use the models to predict the prices of the houses using the predict function as we did above. Make sure when predicting the prices that we are given all the features that were present when training the model.
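
As a small sketch of this step, the snippet below trains a linear regression on made-up data and predicts for one new sample. predict() expects a 2-D array with one column per training feature, so a single sample has to be reshaped into one row. The data and names are illustrative and do not come from the housing file.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data: 50 samples, 13 features, noiseless linear target.
rng = np.random.default_rng(5)
X_fake = rng.uniform(size=(50, 13))
true_coefs = np.arange(1, 14, dtype=float)
y_fake = X_fake @ true_coefs

model_demo = LinearRegression().fit(X_fake, y_fake)

# One new sample must supply all 13 features, in the same order as in training.
new_house = rng.uniform(size=13)
predicted_price = model_demo.predict(new_house.reshape(1, -1))
print(predicted_price.shape)  # (1,)
```

Passing a sample with a different number of features than the model was trained on makes predict() fail, which is why every training feature has to be available at prediction time.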

Here is the whole script:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error
column_names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
bos1 = pd.read_csv('housing.csv', delimiter=r"\s+", names=column_names)
X=np.array(bos1.iloc[:,0:13])
Y=np.array(bos1["MEDV"])
#testing data size is of 30% of entire data
x_train, x_test, y_train, y_test =train_test_split(X,Y, test_size = 0.30, random_state =54)
#load our first model 
lr = LinearRegression()
#train the model on training data
lr.fit(x_train,y_train)
#predict the testing data so that we can later evaluate the model
pred_lr = lr.predict(x_test)
#load the second model
Nn=KNeighborsRegressor(12)
Nn.fit(x_train,y_train)
pred_Nn = Nn.predict(x_test)
#RMSE for linear regression (square root of the mean squared error)
rmse_lr = np.sqrt(mean_squared_error(y_test, pred_lr))
print("error for Linear Regression = {}".format(rmse_lr))
#RMSE for K-NN
rmse_Nn = np.sqrt(mean_squared_error(y_test, pred_Nn))
print("error for K-NN = {}".format(rmse_Nn))

Implementation of a Classification problem

In this section, we will solve the popular classification problem known as the Iris Classification problem. The Iris dataset was used in R.A. Fisher’s classic 1936 paper, The Use of Multiple Measurements in Taxonomic Problems, and can also be found on the UCI Machine Learning Repository.

It includes three iris species with 50 samples each as well as some properties about each flower. One flower species is linearly separable from the other two, but the other two are not linearly separable from each other. The columns in this dataset are:

Figure: Different species of iris

  • SepalLengthCm
  • SepalWidthCm
  • PetalLengthCm
  • PetalWidthCm
  • Species

We don’t need to download this dataset, as the scikit-learn library already contains it and we can simply import it from there. So let us start coding this up:

from sklearn.datasets import load_iris
iris = load_iris()
X=iris.data
Y=iris.target
print(X)
print(Y)

As we can see, the features are printed as a list of rows, each containing four values, and at the bottom we get a list of labels that have been transformed into numbers. The model cannot work with names that are strings, so each name is encoded as a number. This has already been done by the scikit-learn developers.
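
To see which number stands for which species, the object returned by load_iris() also carries the original names in target_names. A short sketch (the variable names are illustrative):

```python
from sklearn.datasets import load_iris

iris_demo = load_iris()

# Map each numeric label back to its species name.
label_to_name = {i: str(name) for i, name in enumerate(iris_demo.target_names)}
print(label_to_name)  # {0: 'setosa', 1: 'versicolor', 2: 'virginica'}

# The four measurement columns and the overall shape of the feature matrix.
print(iris_demo.feature_names)
print(iris_demo.data.shape)  # (150, 4)
```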

from sklearn.model_selection import train_test_split
#testing data size is of 30% of entire data
x_train, x_test, y_train, y_test =train_test_split(X,Y, test_size = 0.3, random_state =5)
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
#fitting our model to train and test
Nn = KNeighborsClassifier(8)
Nn.fit(x_train,y_train)
#the score() method calculates the accuracy of model.
print("Accuracy for K-NN is ",Nn.score(x_test,y_test))
Lr = LogisticRegression()
Lr.fit(x_train,y_train)
print("Accuracy for Logistic Regression is ",Lr.score(x_test,y_test))

Advantages of Machine Learning

1. Easily identifies trends and patterns

Machine Learning can review large volumes of data and discover specific trends and patterns that would not be apparent to humans. For instance, for e-commerce websites like Amazon and Flipkart, it serves to understand the browsing behaviors and purchase histories of its users to help cater to the right products, deals, and reminders relevant to them. It uses the results to reveal relevant advertisements to them.

2. Continuous Improvement

We are continuously generating new data, and when we provide this data to a Machine Learning model, it helps the model improve over time and increase its performance and accuracy. It is like gaining experience: models keep improving in accuracy and efficiency, which lets them make better decisions.

3. Handling multidimensional and multi-variety data

Machine Learning algorithms are good at handling data that are multidimensional and multi-variety, and they can do this in dynamic or uncertain environments.

4. Wide Applications

You could be an e-tailer or a healthcare provider and make Machine Learning work for you. Where it does apply, it holds the capability to help deliver a much more personal experience to customers while also targeting the right customers.

Disadvantages of Machine Learning

1. Data Acquisition

Machine Learning requires massive data sets to train on, and these should be inclusive, unbiased, and of good quality. There can also be times when we must wait for new data to be generated.

2. Time and Resources

Machine Learning needs enough time to let the algorithms learn and develop enough to fulfill their purpose with a considerable amount of accuracy and relevancy. It also needs massive resources to function. This can mean additional requirements of computer power for you.

3. Interpretation of Results

Another major challenge is the ability to accurately interpret results generated by the algorithms. You must also carefully choose the algorithms for your purpose. Sometimes, based on some analysis you might select an algorithm but it is not necessary that this model is best for the problem.

4. High error-susceptibility

Machine Learning is autonomous but highly susceptible to errors. Suppose you train an algorithm with data sets small enough to not be inclusive. You end up with biased predictions coming from a biased training set. This leads to irrelevant advertisements being displayed to customers. In the case of Machine Learning, such blunders can set off a chain of errors that can go undetected for long periods of time. And when they do get noticed, it takes quite some time to recognize the source of the issue, and even longer to correct it.

Future of Machine Learning

Machine Learning can be a competitive advantage to any company, be it a top MNC or a startup, as things that are currently being done manually will be done tomorrow by machines. With the introduction of projects such as self-driving cars and Sophia (a humanoid robot developed by Hong Kong-based company Hanson Robotics), we have already caught a glimpse of what the future can be. The Machine Learning revolution will stay with us for a long time, and so will the future of Machine Learning.

Machine Learning Tutorial FAQs

How do I start learning Machine Learning?

You first need to start with the basics. You need to understand the prerequisites, which include learning Linear Algebra and Multivariate Calculus, Statistics, and Python. Then you need to learn several ML concepts, which include terminology of Machine Learning, types of Machine Learning, and Resources of Machine Learning. The third step is taking part in competitions. You can also take up a free online statistics for machine learning course and understand the foundational concepts.

Is Machine Learning easy for beginners? 

Machine Learning is not the easiest subject. The main difficulty in learning it lies in debugging. However, if you study the right resources, you will be able to learn Machine Learning without any hassles.

What is a simple example of Machine Learning? 

Recommendation Engines (Netflix); Sorting, tagging and categorizing photos (Yelp); Customer Lifetime Value (Asos); Self-Driving Cars (Waymo); Education (Duolingo); Determining Credit Worthiness (Deserve); Patient Sickness Predictions (KenSci); and Targeted Emails (Optimail).

Can I learn Machine Learning in 3 months? 

Machine Learning is vast and consists of several things. Therefore, it will take you around six months to learn it, provided you spend at least 5-6 hours every day. Also, the time taken to learn Machine Learning depends a lot on your mathematical and analytical skills.

Does Machine Learning require coding? 

If you are learning traditional Machine Learning, it would require you to know software programming as it will help you to write machine learning algorithms. However, through some online educational platforms, you do not need to know coding to learn Machine Learning.

Is Machine Learning a good career? 

Machine Learning is one of the best careers at present. Whether you look at current demand, job growth, or salary growth, Machine Learning Engineer is one of the best profiles. You need to be very good at data, automation, and algorithms.

Can I learn Machine Learning without Python? 

To learn Machine Learning, you need to have some basic knowledge of Python. Anaconda is a Python distribution that is supported on all major operating systems, such as Windows and Linux. It offers an overall package for machine learning, including matplotlib, scikit-learn, and NumPy.

Where can I practice Machine Learning? 

The online platforms where you can practice Machine Learning include CloudXLab, Google Colab, Kaggle, MachineHack, and OpenML.

Where can I learn Machine Learning for free?

You can learn the basics of Machine Learning from online platforms like Great Learning. You can enroll in the Beginners Machine Learning course and get the certificate for free. The course is easy and perfect for beginners to start with.


Original article source at: https://www.mygreatlearning.com

#machine-learning 

Amazon Rekognition Video Analyzer Written in Opencv

Create a Serverless Pipeline for Video Frame Analysis and Alerting

Introduction

Imagine being able to capture live video streams, identify objects using deep learning, and then trigger actions or notifications based on the identified objects -- all with low latency and without a single server to manage.

This is exactly what this project is going to help you accomplish with AWS. You will be able to set up and run a live video capture, analysis, and alerting solution prototype.

The prototype was conceived to address a specific use case, which is alerting based on a live video feed from an IP security camera. At a high level, the solution works as follows. A camera surveils a particular area, streaming video over the network to a video capture client. The client samples video frames and sends them over to AWS, where they are analyzed and stored along with metadata. If certain objects are detected in the analyzed video frames, SMS alerts are sent out. Once a person receives an SMS alert, they will likely want to know what caused it. For that, sampled video frames can be monitored with low latency using a web-based user interface.

Here's the prototype's conceptual architecture:

Architecture

Let's go through the steps necessary to get this prototype up and running. If you are starting from scratch and are not familiar with Python, completing all steps can take a few hours.

Preparing your development environment

Here’s a high-level checklist of what you need to do to set up your development environment.

  1. Sign up for an AWS account if you haven't already and create an Administrator User. The steps are published here.
  2. Ensure that you have Python 2.7+ and Pip on your machine. Instructions for that vary based on your operating system and OS version.
  3. Create a Python virtual environment for the project with Virtualenv. This helps keep the project’s Python dependencies neatly isolated from your operating system’s default Python installation. Once you’ve created a virtual Python environment, activate it before moving on with the following steps.
  4. Use Pip to install AWS CLI. Configure the AWS CLI. It is recommended that the access keys you configure are associated with an IAM User who has full access to the following:
  • Amazon S3
  • Amazon DynamoDB
  • Amazon Kinesis
  • AWS Lambda
  • Amazon CloudWatch and CloudWatch Logs
  • AWS CloudFormation
  • Amazon Rekognition
  • Amazon SNS
  • Amazon API Gateway
  • Creating IAM Roles

The IAM User can be the Administrator User you created in Step 1.

5.   Make sure you choose a region where all of the above services are available. Regions us-east-1 (N. Virginia), us-west-2 (Oregon), and eu-west-1 (Ireland) fulfill this criterion. Visit this page to learn more about service availability in AWS regions.

6.   Use Pip to install Open CV 3 python dependencies and then compile, build, and install Open CV 3 (required by Video Cap clients). You can follow this guide to get Open CV 3 up and running on OS X Sierra with Python 2.7. There's another guide for Open CV 3 and Python 3.5 on OS X Sierra. Other guides exist as well for Windows and Raspberry Pi.

7.   Use Pip to install Boto3. Boto is the Amazon Web Services (AWS) SDK for Python, which allows Python developers to write software that makes use of Amazon services like S3 and EC2. Boto provides an easy to use, object-oriented API as well as low-level direct access to AWS services.

8.   Use Pip to install Pynt. Pynt enables you to write project build scripts in Python.

9.   Clone this GitHub repository. Choose a directory path for your project that does not contain spaces (I'll refer to the full path to this directory as <path-to-project-dir>).

10.   Use Pip to install pytz. Pytz is needed for timezone calculations. Use the following commands:

pip install pytz # Install pytz in your virtual python env

pip install pytz -t <path-to-project-dir>/lambda/imageprocessor/ # Install pytz to be packaged and deployed with the Image Processor lambda function

Finally, obtain an IP camera. If you don’t have an IP camera, you can use your smartphone with an IP camera app. This is useful in case you want to test things out before investing in an IP camera. Also, you can simply use your laptop’s built-in camera or a connected USB camera. If you use an IP camera, make sure your camera is connected to the same Local Area Network as the Video Capture client.

Configuring the project

In this section, I list every configuration file, parameters within it, and parameter default values. The build commands detailed later extract the majority of their parameters from these configuration files. Also, the prototype's two AWS Lambda functions - Image Processor and Frame Fetcher - extract parameters at runtime from imageprocessor-params.json and framefetcher-params.json respectively.

NOTE: Do not remove any of the attributes already specified in these files.

NOTE: You must set the value of any parameter that has the tag NO-DEFAULT

config/global-params.json

Specifies “global” build configuration parameters. It is read by multiple build scripts.

{
    "StackName" : "video-analyzer-stack"
}

Parameters:

  • StackName - The name of the stack to be created in your AWS account.

config/cfn-params.json

Specifies and overrides default values of AWS CloudFormation parameters defined in the template (located at aws-infra/aws-infra-cfn.yaml). This file is read by a number of build scripts, including createstack, deploylambda, and webui.

{
    "SourceS3BucketParameter" : "<NO-DEFAULT>",
    "ImageProcessorSourceS3KeyParameter" : "src/lambda_imageprocessor.zip",
    "FrameFetcherSourceS3KeyParameter" : "src/lambda_framefetcher.zip",

    "FrameS3BucketNameParameter" : "<NO-DEFAULT>",

    "FrameFetcherApiResourcePathPart" : "enrichedframe",
    "ApiGatewayRestApiNameParameter" : "VidAnalyzerRestApi",
    "ApiGatewayStageNameParameter": "development",
    "ApiGatewayUsagePlanNameParameter" : "development-plan"
}

Parameters:

SourceS3BucketParameter - The Amazon S3 bucket to which your AWS Lambda function packages (.zip files) will be deployed. If a bucket with such a name does not exist, the deploylambda build command will create it for you with appropriate permissions. AWS CloudFormation will access this bucket to retrieve the .zip files for Image Processor and Frame Fetcher AWS Lambda functions.

ImageProcessorSourceS3KeyParameter - The Amazon S3 key under which the Image Processor function .zip file will be stored.

FrameFetcherSourceS3KeyParameter - The Amazon S3 key under which the Frame Fetcher function .zip file will be stored.

FrameS3BucketNameParameter - The Amazon S3 bucket that will be used for storing video frame images. There must not be an existing S3 bucket with the same name.

FrameFetcherApiResourcePathPart - The name of the Frame Fetcher API resource path part in the API Gateway URL.

ApiGatewayRestApiNameParameter - The name of the API Gateway REST API to be created by AWS CloudFormation.

ApiGatewayStageNameParameter - The name of the API Gateway stage to be created by AWS CloudFormation.

ApiGatewayUsagePlanNameParameter - The name of the API Gateway usage plan to be created by AWS CloudFormation.
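
Because parameters tagged NO-DEFAULT must be filled in before building, a quick pre-flight check can catch forgotten values. The helper below is a hypothetical sketch and not part of the project's build scripts; it assumes the placeholder appears literally as the string "<NO-DEFAULT>", as in the listing above.

```python
import json

PLACEHOLDER = "<NO-DEFAULT>"

def find_unset_params(config_text):
    """Return the names of top-level parameters still set to the placeholder."""
    params = json.loads(config_text)
    return [name for name, value in params.items() if value == PLACEHOLDER]

# Trimmed-down example modelled on config/cfn-params.json.
sample_config = """
{
    "SourceS3BucketParameter" : "<NO-DEFAULT>",
    "ImageProcessorSourceS3KeyParameter" : "src/lambda_imageprocessor.zip",
    "FrameS3BucketNameParameter" : "<NO-DEFAULT>"
}
"""

unset = find_unset_params(sample_config)
print(unset)  # ['SourceS3BucketParameter', 'FrameS3BucketNameParameter']
```

To check a real file, read its contents (for example with open("config/cfn-params.json").read()) and pass the text to find_unset_params().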

config/imageprocessor-params.json

Specifies configuration parameters to be used at run-time by the Image Processor lambda function. This file is packaged along with the Image Processor lambda function code in a single .zip file using the packagelambda build script.

{
    "s3_bucket" : "<NO-DEFAULT>",
    "s3_key_frames_root" : "frames/",

    "ddb_table" : "EnrichedFrame",

    "rekog_max_labels" : 123,
    "rekog_min_conf" : 50.0,

    "label_watch_list" : ["Human", "Pet", "Bag", "Toy"],
    "label_watch_min_conf" : 90.0,
    "label_watch_phone_num" : "",
    "label_watch_sns_topic_arn" : "",
    "timezone" : "US/Eastern"
}

s3_bucket - The Amazon S3 bucket in which Image Processor will store captured video frame images. The value specified here must match the value specified for the FrameS3BucketNameParameter parameter in the cfn-params.json file.

s3_key_frames_root - The Amazon S3 key prefix that will be prepended to the keys of all stored video frame images.

ddb_table - The Amazon DynamoDB table in which Image Processor will store video frame metadata. The default value, EnrichedFrame, matches the default value of the AWS CloudFormation template parameter DDBTableNameParameter in the aws-infra/aws-infra-cfn.yaml template file.

rekog_max_labels - The maximum number of labels that Amazon Rekognition can return to Image Processor.

rekog_min_conf - The minimum confidence required for a label identified by Amazon Rekognition. Any labels with confidence below this value will not be returned to Image Processor.

label_watch_list - A list of labels to watch out for. If any of the labels specified in this parameter are returned by Amazon Rekognition, an SMS alert will be sent via Amazon SNS. The label's confidence must exceed label_watch_min_conf.

label_watch_min_conf - The minimum confidence required for a label to trigger a Watch List alert.

label_watch_phone_num - The mobile phone number to which a Watch List SMS alert will be sent. Does not have a default value. You must configure a valid phone number adhering to the E.164 format (e.g. +1404XXXYYYY) for the Watch List feature to become active.

label_watch_sns_topic_arn - The SNS topic ARN to which you want Watch List alert messages to be sent. The alert message contains a notification text in addition to a JSON formatted list of Watch List labels found. This can be used to publish alerts to any SNS subscribers, such as Amazon SQS queues.

timezone - The timezone used to report time and date in SMS alerts. By default, it is "US/Eastern". See this list of country codes, names, continents, capitals, and pytz timezones.
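
The interplay of label_watch_list and label_watch_min_conf can be sketched as a simple filter. This illustrates the rule described above and is not the project's actual Image Processor code; the function name is made up, and the sample labels imitate the Name and Confidence fields that Amazon Rekognition's DetectLabels operation returns.

```python
def labels_on_watch_list(labels, watch_list, min_conf):
    """Return the names of detected labels that should trigger an alert."""
    watched = set(watch_list)
    return [
        label["Name"]
        for label in labels
        # A label must be on the list AND exceed the confidence threshold.
        if label["Name"] in watched and label["Confidence"] > min_conf
    ]

# Hypothetical detection results for one video frame.
sample_labels = [
    {"Name": "Human", "Confidence": 97.2},
    {"Name": "Bag", "Confidence": 62.0},   # on the list, but below the threshold
    {"Name": "Tree", "Confidence": 99.1},  # confident, but not on the list
]

hits = labels_on_watch_list(sample_labels, ["Human", "Pet", "Bag", "Toy"], 90.0)
print(hits)  # ['Human']
```

With the default settings shown earlier, a frame would trigger an alert only when the returned list is non-empty.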

config/framefetcher-params.json

Specifies configuration parameters to be used at run-time by the Frame Fetcher lambda function. This file is packaged along with the Frame Fetcher lambda function code in a single .zip file using the packagelambda build script.

{
    "s3_pre_signed_url_expiry" : 1800,

    "ddb_table" : "EnrichedFrame",
    "ddb_gsi_name" : "processed_year_month-processed_timestamp-index",

    "fetch_horizon_hrs" : 24,
    "fetch_limit" : 3
}

s3_pre_signed_url_expiry - Frame Fetcher returns video frame metadata. Along with the returned metadata, Frame Fetcher generates and returns a pre-signed URL for every video frame. Using a pre-signed URL, a client (such as the Web UI) can securely access the JPEG image associated with a particular frame. By default, the pre-signed URLs expire in 30 minutes.

ddb_table - The Amazon DynamoDB table from which Frame Fetcher will fetch video frame metadata. The default value, EnrichedFrame, matches the default value of the AWS CloudFormation template parameter DDBTableNameParameter in the aws-infra/aws-infra-cfn.yaml template file.

ddb_gsi_name - The name of the Amazon DynamoDB Global Secondary Index that Frame Fetcher will use to query frame metadata. The default value matches the default value of the AWS CloudFormation template parameter DDBGlobalSecondaryIndexNameParameter in the aws-infra/aws-infra-cfn.yaml template file.

fetch_horizon_hrs - Frame Fetcher will exclude any video frames that were ingested prior to the point in the past represented by (time now - fetch_horizon_hrs).

fetch_limit - The maximum number of video frame metadata items that Frame Fetcher will retrieve from Amazon DynamoDB.
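
The effect of fetch_horizon_hrs and fetch_limit can be illustrated with plain Python. The sketch below is an assumption about the behaviour described above (the real Frame Fetcher queries Amazon DynamoDB); the function and field names are hypothetical, and returning the newest frames first is likewise an assumption.

```python
from datetime import datetime, timedelta, timezone

def select_frames(frames, now, fetch_horizon_hrs=24, fetch_limit=3):
    """Keep frames newer than the horizon, newest first, capped at fetch_limit."""
    cutoff = now - timedelta(hours=fetch_horizon_hrs)
    recent = [f for f in frames if f["processed_at"] >= cutoff]
    recent.sort(key=lambda f: f["processed_at"], reverse=True)
    return recent[:fetch_limit]

# Fixed reference time and made-up frames so the example is reproducible.
now = datetime(2017, 1, 2, 12, 0, tzinfo=timezone.utc)
frames = [
    {"frame_id": "a", "processed_at": now - timedelta(hours=30)},  # too old
    {"frame_id": "b", "processed_at": now - timedelta(hours=5)},
    {"frame_id": "c", "processed_at": now - timedelta(hours=1)},
    {"frame_id": "d", "processed_at": now - timedelta(minutes=10)},
    {"frame_id": "e", "processed_at": now - timedelta(hours=2)},
]

selected_ids = [f["frame_id"] for f in select_frames(frames, now)]
print(selected_ids)  # ['d', 'c', 'e']
```

Frame "a" falls outside the 24-hour horizon, and frame "b" is dropped because only the three newest frames fit within the limit.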

Building the prototype

Common interactions with the project have been simplified for you. Using pynt, the following tasks are automated with simple commands:

  • Creating, deleting, and updating the AWS infrastructure stack with AWS CloudFormation
  • Packaging lambda code into .zip files and deploying them into an Amazon S3 bucket
  • Running the video capture client to stream from a built-in laptop webcam or a USB camera
  • Running the video capture client to stream from an IP camera (MJPEG stream)
  • Building a simple web user interface (Web UI)
  • Running a lightweight local HTTP server to serve the Web UI for development and demo purposes

For a list of all available tasks, enter the following command in the root directory of this project:

pynt -l

The output represents the list of build commands available to you:

pynt -l output

Build commands are implemented as python scripts in the file build.py. The scripts use the AWS Python SDK (Boto) under the hood. They are documented in the following section.

Prior to using these build commands, you must configure the project. Configuration parameters are split across JSON-formatted files located under the config/ directory. Configuration parameters are described in detail in an earlier section.

Build commands

This section describes important build commands and how to use them. If you want to use these commands right away to build the prototype, you may skip to the section titled "Deploy and run the prototype".

The packagelambda build command

Run this command to package the prototype's AWS Lambda functions (Image Processor and Frame Fetcher) and their dependencies into separate .zip packages (one per function). The deployment packages are created under the build/ directory.

pynt packagelambda # Package both functions and their dependencies into zip files.

pynt packagelambda[framefetcher] # Package only Frame Fetcher.

Currently, only Image Processor requires an external dependency, pytz. If you add features to Image Processor or Frame Fetcher that require external dependencies, you should install the dependencies using Pip by issuing the following command.

pip install <module-name> -t <path-to-project-dir>/lambda/<lambda-function-dir>

For example, let's say you want to perform image processing in the Image Processor Lambda function. You may decide on using the Pillow image processing library. To ensure Pillow is packaged with your Lambda function in one .zip file, issue the following command:

pip install Pillow -t <path-to-project-dir>/lambda/imageprocessor #Install Pillow dependency

You can find more details on installing AWS Lambda dependencies here.

The deploylambda build command

Run this command before you run createstack. The deploylambda command uploads the Image Processor and Frame Fetcher .zip packages to Amazon S3 for pickup by AWS CloudFormation while creating the prototype's stack. This command will parse the deployment Amazon S3 bucket name and key names from the cfn-params.json file. If the bucket does not exist, the script will create it. This bucket must be in the same AWS region as the AWS CloudFormation stack, or else the stack creation will fail. Without parameters, the command will deploy the .zip packages of both Image Processor and Frame Fetcher. You can specify either “imageprocessor” or “framefetcher” as a parameter between square brackets to deploy an individual function.

Here are sample command invocations.

pynt deploylambda # Deploy both functions to Amazon S3.

pynt deploylambda[framefetcher] # Deploy only Frame Fetcher to Amazon S3.

The createstack build command

The createstack command creates the prototype's AWS CloudFormation stack behind the scenes by invoking the create_stack() API. The AWS CloudFormation template used is located at aws-infra/aws-infra-cfn.yaml under the project’s root directory. The prototype's stack requires a number of parameters to be successfully created. The createstack script reads parameters from both global-params.json and cfn-params.json configuration files. The script then passes those parameters to the create_stack() call.

Note that you must first package and deploy the Image Processor and Frame Fetcher functions to Amazon S3 using the packagelambda and deploylambda commands (documented earlier in this guide) for the AWS CloudFormation stack creation to succeed.

You can issue the command as follows:

pynt createstack

Stack creation should take only a couple of minutes. At any time, you can check on the prototype's stack status either through the AWS CloudFormation console or by issuing the following command.

pynt stackstatus

Congratulations! You’ve just created the prototype's entire architecture in your AWS account.

The deletestack build command

The deletestack command, once issued, does a few things. First, it empties the Amazon S3 bucket used to store video frame images. Next, it calls the AWS CloudFormation delete_stack() API to delete the prototype's stack from your account. Finally, it removes any unneeded resources not deleted by the stack (for example, the prototype's API Gateway Usage Plan resource).

You can issue the deletestack command as follows.

pynt deletestack
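The first two steps described above (empty the frame bucket, then delete the stack) might look roughly like this. It is a sketch rather than the project's code: the clients are passed in by the caller, the bucket is assumed to be unversioned, and the final cleanup of leftover resources such as the usage plan is omitted.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def empty_bucket(s3_client, bucket):
    """Delete every object in the bucket and return how many were removed."""
    keys = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is absent when a page (or the bucket) is empty.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    # delete_objects accepts at most 1000 keys per request.
    for batch in chunked(keys, 1000):
        s3_client.delete_objects(
            Bucket=bucket, Delete={"Objects": [{"Key": k} for k in batch]}
        )
    return len(keys)


def delete_prototype_stack(s3_client, cfn_client, bucket, stack_name):
    """Empty the frame bucket first, then ask CloudFormation to delete the stack."""
    empty_bucket(s3_client, bucket)
    cfn_client.delete_stack(StackName=stack_name)
```

The order matters: CloudFormation cannot delete an S3 bucket that still contains objects, so the bucket is emptied before delete_stack() is called.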

As with createstack, you can monitor the progress of stack deletion using the stackstatus build command.

The deletedata build command

The deletedata command, once issued, first empties the Amazon S3 bucket used to store video frame images. It then deletes all items in the DynamoDB table used to store frame metadata.

Use this command to clear all previously ingested video frames and associated metadata. The command will ask for confirmation [Y/N] before proceeding with deletion.

You can issue the deletedata command as follows.

pynt deletedata
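The confirmation prompt and the DynamoDB cleanup could be sketched as follows. This is an assumption-laden illustration: `table` stands for a boto3 DynamoDB Table resource supplied by the caller, and `key_names` lists the table's primary key attributes, whose real names are not given here (the "frame_id" used in the usage note is hypothetical).

```python
def confirmed(answer):
    """Interpret a [Y/N] answer; anything other than y/yes counts as no."""
    return answer.strip().lower() in ("y", "yes")


def delete_all_items(table, key_names):
    """Scan the table and delete every item by its primary key.

    Returns the number of delete requests issued.
    """
    # Alias attribute names so reserved words cannot break the projection.
    aliases = {"#k%d" % i: name for i, name in enumerate(key_names)}
    scan_args = {
        "ProjectionExpression": ", ".join(aliases),
        "ExpressionAttributeNames": aliases,
    }
    deleted = 0
    # batch_writer buffers the deletes and sends them in batches.
    with table.batch_writer() as batch:
        while True:
            page = table.scan(**scan_args)
            for item in page["Items"]:
                batch.delete_item(Key={k: item[k] for k in key_names})
                deleted += 1
            # A missing LastEvaluatedKey means the scan has reached the end.
            if "LastEvaluatedKey" not in page:
                break
            scan_args["ExclusiveStartKey"] = page["LastEvaluatedKey"]
    return deleted
```

A caller would typically guard the deletion with something like `if confirmed(input("Delete all data? [Y/N] ")): delete_all_items(table, ["frame_id"])`.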

The stackstatus build command

The stackstatus command will query AWS CloudFormation for the status of the prototype's stack. This command is most useful for quickly checking that the prototype is up and running (i.e. status is "CREATE_COMPLETE" or "UPDATE_COMPLETE") and ready to serve requests from the Web UI.

You can issue the command as follows.

pynt stackstatus # Get the prototype's Stack Status
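If you want to perform the same check from your own Python code, a minimal sketch could look like this. The CloudFormation client is supplied by the caller (for example, boto3.client("cloudformation")), and the helper names are illustrative only.

```python
# Statuses that indicate the prototype is ready to serve requests.
READY_STATUSES = {"CREATE_COMPLETE", "UPDATE_COMPLETE"}


def get_stack_status(cfn_client, stack_name):
    """Return the StackStatus string reported by CloudFormation."""
    response = cfn_client.describe_stacks(StackName=stack_name)
    return response["Stacks"][0]["StackStatus"]


def is_ready(status):
    """True when the stack is up and running."""
    return status in READY_STATUSES
```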

The webui build command

Run this command when the prototype's stack has been created (using createstack). The webui command “builds” the Web UI through which you can monitor incoming captured video frames. First, the script copies the webui/ directory verbatim into the project’s build/ directory. Next, the script generates an apigw.js file which contains the API Gateway base URL and the API key to be used by Web UI for invoking the Fetch Frames function deployed in AWS Lambda. This file is created in the Web UI build directory.

You can issue the Web UI build command as follows.

pynt webui
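The two steps described above (copy the Web UI sources, then generate apigw.js) can be sketched as follows. This is not the project's build script: the JavaScript variable names `apiBaseUrl` and `apiKey` are hypothetical, and the destination folder name is simply taken from the source directory.

```python
import json
import os
import shutil


def render_apigw_js(base_url, api_key):
    """Return the contents of a small JS file exposing the API settings."""
    # json.dumps produces valid, safely escaped JavaScript string literals.
    return (
        "var apiBaseUrl = %s;\n"
        "var apiKey = %s;\n"
    ) % (json.dumps(base_url), json.dumps(api_key))


def build_webui(src_dir, build_dir, base_url, api_key):
    """Copy the Web UI verbatim into build_dir and add apigw.js to the copy."""
    dest = os.path.join(build_dir, os.path.basename(os.path.normpath(src_dir)))
    if os.path.isdir(dest):
        shutil.rmtree(dest)  # start from a clean copy on every build
    shutil.copytree(src_dir, dest)
    with open(os.path.join(dest, "apigw.js"), "w") as f:
        f.write(render_apigw_js(base_url, api_key))
    return dest
```

Keep in mind that a file generated this way contains the API key in clear text, so the build output should be treated accordingly.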

The webuiserver build command

The webuiserver command starts a local, lightweight, Python-based HTTP server on your machine to serve Web UI from the build/web-ui/ directory. Use this command to serve the prototype's Web UI for development and demonstration purposes. You can specify the server’s port as a pynt task parameter, between square brackets.

Here’s a sample invocation of the command.

pynt webuiserver # Starts lightweight HTTP Server on port 8080.
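For reference, an equivalent lightweight server can be put together with Python 3's standard library alone. The sketch below is an approximation of what the command does, not its actual implementation, and the function names are illustrative. It requires Python 3.7 or later for the `directory` argument.

```python
import functools
from http.server import HTTPServer, SimpleHTTPRequestHandler


def make_webui_server(directory, port=8080):
    """Create (but do not start) an HTTP server for the built Web UI."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    # Bind to localhost only; this server is meant for development and demos.
    return HTTPServer(("127.0.0.1", port), handler)


def serve_webui(directory, port=8080):
    """Serve the Web UI until interrupted with Ctrl+C."""
    server = make_webui_server(directory, port)
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
```

Calling serve_webui("build/web-ui") would then serve the build directory on port 8080.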

The videocaptureip and videocapture build commands

The videocaptureip command fires up the MJPEG-based video capture client (source code under the client/ directory). This command accepts, as parameters, an MJPEG stream URL and an optional frame capture rate. The capture rate is defined as one frame out of every X frames. Captured frames are packaged, serialized, and sent to the Kinesis Frame Stream. The video capture client for IP cameras uses OpenCV 3 to do simple image processing operations on captured frame images, mainly image rotation.

Here’s a sample command invocation.

pynt videocaptureip["http://192.168.0.2/video",20] # Captures 1 frame every 20.
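To make the capture rate and the package/serialize/send steps concrete, here is a minimal sketch. It is not the client's actual code: the record fields, the JSON-plus-base64 serialization, and the partition key are stand-ins chosen for illustration, and the Kinesis client is passed in by the caller (for example, boto3.client("kinesis")).

```python
import base64
import json
import time


def should_capture(frame_index, capture_rate):
    """True for one out of every `capture_rate` frames (0, rate, 2*rate, ...)."""
    return frame_index % capture_rate == 0


def package_frame(jpeg_bytes, frame_index, captured_at=None):
    """Serialize one frame plus minimal metadata into bytes for Kinesis."""
    record = {
        "frame_index": frame_index,  # hypothetical field names
        "captured_at": time.time() if captured_at is None else captured_at,
        "image_base64": base64.b64encode(jpeg_bytes).decode("ascii"),
    }
    return json.dumps(record).encode("utf-8")


def send_frame(kinesis_client, stream_name, payload, partition_key="partition-0"):
    """Put one serialized frame onto the Kinesis stream."""
    kinesis_client.put_record(
        StreamName=stream_name, Data=payload, PartitionKey=partition_key
    )
```

With a capture rate of 20, should_capture selects frames 0, 20, 40, and so on, and the remaining frames are skipped before any packaging work is done.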

On the other hand, the videocapture command (without the trailing 'ip') fires up a video capture client that captures frames from a camera attached to the machine on which it runs. If you run this command on your laptop, for instance, the client will attempt to access its built-in video camera. This video capture client relies on OpenCV 3 to capture video from physically connected cameras. Captured frames are packaged, serialized, and sent to the Kinesis Frame Stream.

Here’s a sample invocation.

pynt videocapture[20] # Captures one frame every 20.

Deploy and run the prototype

In this section, we are going to use the project's build commands to deploy and run the prototype in your AWS account. We’ll use the commands to create the prototype's AWS CloudFormation stack, build and serve the Web UI, and run the Video Cap client.

Prepare your development environment, and ensure configuration parameters are set as you wish.

On your machine, in a command line terminal, change into the root directory of the project. Activate your virtual Python environment. Then, enter the following commands:

$ pynt packagelambda #First, package code & configuration files into .zip files

#Command output without errors

$ pynt deploylambda #Second, deploy your lambda code to Amazon S3

#Command output without errors

$ pynt createstack #Now, create the prototype's CloudFormation stack

#Command output without errors

$ pynt webui #Build the Web UI

#Command output without errors
  • On your machine, in a separate command line terminal:
$ pynt webuiserver #Start the Web UI server on port 8080 by default
  • In your browser, access http://localhost:8080 to access the prototype's Web UI. You should see a screen similar to this:

Empty Web UI

Now turn on your IP camera or launch the app on your smartphone. Ensure that your camera is accepting connections for streaming MJPEG video over HTTP, and identify the local URL for accessing that stream.

Then, in a terminal window at the root directory of the project, issue this command:

$ pynt videocaptureip["<your-ip-cam-mjpeg-url>",<capture-rate>]
  • Or, if you don’t have an IP camera and would like to use a built-in camera:
$ pynt videocapture[<frame-capture-rate>]
  • A few seconds after you execute this step, the dashed area in the Web UI will auto-populate with captured frames, side by side with labels recognized in them.

When you are done

After you are done experimenting with the prototype, perform the following steps to avoid unwanted costs.

  • Terminate the video capture client(s) (press Ctrl+C in the command line terminal where each one is running).
  • Close all open Web UI browser windows or tabs.
  • Execute the pynt deletestack command (see docs above)
  • After you run deletestack, visit the AWS CloudFormation console to double-check the stack is deleted.
  • Ensure that Amazon S3 buckets and objects within them are deleted.

Remember, you can always set up the entire prototype again with a few simple commands.

License

Licensed under the Amazon Software License.

A copy of the License is located at

http://aws.amazon.com/asl/

The AWS CloudFormation Stack (optional read)

Let’s quickly go through the stack that AWS CloudFormation sets up in your account based on the template. AWS CloudFormation uses as much parallelism as possible while creating resources. As a result, some resources may be created in an order different than what I’m going to describe here.

First, AWS CloudFormation creates the IAM roles necessary to allow AWS services to interact with one another. This includes the following.

ImageProcessorLambdaExecutionRole – a role to be assumed by the Image Processor lambda function. It allows full access to Amazon DynamoDB, Amazon S3, Amazon SNS, and Amazon CloudWatch Logs. The role also allows read-only access to Amazon Kinesis and Amazon Rekognition. For simplicity, only AWS managed permission policies are used.

FrameFetcherLambdaExecutionRole – a role to be assumed by the Frame Fetcher lambda function. It allows full access to Amazon S3, Amazon DynamoDB, and Amazon CloudWatch Logs. For simplicity, only AWS managed permission policies are used.

In parallel, AWS CloudFormation creates the Amazon S3 bucket to be used to store the captured video frame images. It also creates the Kinesis Frame Stream to receive captured video frame images from the Video Cap client.

Next, the Image Processor lambda function is created in addition to an AWS Lambda Event Source Mapping to allow Amazon Kinesis to trigger Image Processor once new captured video frames are available.

The Frame Fetcher lambda function is also created. Frame Fetcher is a simple lambda function that responds to a GET request by returning the latest list of frames, in descending order by processing timestamp, up to a configurable number of hours, called the “fetch horizon” (check the framefetcher-params.json file for more run-time configuration parameters). Necessary AWS Lambda Permissions are also created to permit Amazon API Gateway to invoke the Frame Fetcher lambda function.

AWS CloudFormation also creates the DynamoDB table where Enriched Frame metadata is stored by the Image Processor lambda function as described in the architecture overview section of this post. A Global Secondary Index (GSI) is also created, to be used by the Frame Fetcher lambda function in fetching Enriched Frame metadata in descending order by time of capture.
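The Frame Fetcher behavior described above (only frames inside the fetch horizon, newest first) boils down to a filter followed by a descending sort. The sketch below shows that logic on an in-memory list. It is an illustration only: the field name "processed_timestamp", the millisecond unit, and the optional limit are assumptions, and the real function would obtain its data from DynamoDB rather than from a list.

```python
def latest_frames(frames, now_ms, fetch_horizon_hrs, limit=None):
    """Return frames newer than the fetch horizon, newest first.

    `frames` is a list of dicts with a "processed_timestamp" in milliseconds
    (hypothetical field name and unit).
    """
    cutoff_ms = now_ms - fetch_horizon_hrs * 3600 * 1000
    recent = [f for f in frames if f["processed_timestamp"] >= cutoff_ms]
    recent.sort(key=lambda f: f["processed_timestamp"], reverse=True)
    return recent if limit is None else recent[:limit]
```

When reading through a DynamoDB index instead, the same ordering is normally requested from the service by passing ScanIndexForward=False to the query, so no client-side sort is needed.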

Finally, AWS CloudFormation creates the Amazon API Gateway resources necessary to allow the Web UI to securely invoke the Frame Fetcher lambda function with a GET request to a public API Gateway URL.

The following API Gateway resources are created.

REST API named “RtRekogRestAPI” by default.

An API Gateway resource with a path part set to “enrichedframe” by default.

A GET API Gateway method associated with the “enrichedframe” resource. This method is configured with Lambda proxy integration with the Frame Fetcher lambda function (learn more about Amazon API Gateway proxy integration here). The method is also configured such that an API key is required.

An OPTIONS API Gateway method associated with the “enrichedframe” resource. This method’s purpose is to enable Cross-Origin Resource Sharing (CORS). Enabling CORS allows the Web UI to make Ajax requests to the Frame Fetcher API Gateway URL. Note that the Frame Fetcher lambda function must, itself, also return the Access-Control-Allow-Origin CORS header in its HTTP response.

A “development” API Gateway deployment to allow the invocation of the prototype's API over the Internet.

A “development” API Gateway stage for the API deployment along with an API Gateway usage plan named “development-plan” by default.

An API Gateway API key, named “DevApiKey” by default. The key is associated with the “development” stage and “development-plan” usage plan.

All defaults can be overridden in the cfn-params.json configuration file. That’s it for the prototype's AWS CloudFormation stack! This stack was designed primarily for development/demo purposes, especially how the Amazon API Gateway resources are set up.
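Because the GET method uses Lambda proxy integration, the Frame Fetcher function is responsible for returning a complete HTTP response, including the CORS header mentioned above. A minimal sketch of such a response follows. The helper name and the wildcard origin are illustrative choices, not taken from the project's code.

```python
import json


def proxy_response(body, status_code=200, allowed_origin="*"):
    """Build a response dict in the Lambda proxy integration format."""
    return {
        "statusCode": status_code,
        "headers": {
            "Content-Type": "application/json",
            # Without this header the browser blocks the Web UI's Ajax call.
            "Access-Control-Allow-Origin": allowed_origin,
        },
        # API Gateway expects the body to be a string, not a dict.
        "body": json.dumps(body),
    }
```

For anything beyond a demo, it is usually preferable to pass the Web UI's actual origin as `allowed_origin` instead of the wildcard.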

FAQ

Q: Why is this project titled "amazon-rekognition-video-analyzer" despite the security-focused use case?

A: Although this prototype was conceived to address the security monitoring and alerting use case, you can use the prototype's architecture and code as a starting point to address a wide variety of use cases involving low-latency analysis of live video frames with Amazon Rekognition.

Download Details:
Author: aws-samples
Source Code: https://github.com/aws-samples/amazon-rekognition-video-analyzer
License: View license

#opencv  #python #aws