Computer Vision Using TensorFlow

Many of us are fortunate to have been given the ability to see. From the moment you are born, images of the objects and people around you start to register in your memory, and as time goes on you learn to detect complex emotions and behaviour, navigate through spaces, pick up objects, and much more. People who have sight rely on this sense for almost every action throughout their day. In fact, you are using your sight right now to read this very article.

What if we were able to give computers this ability?

Sci-fi movies like The Terminator, Iron Man, and The Avengers show machines (“robots”) taking on many characteristics of a human. Yet this is only in movies! In the real world, technology has not reached the level of complexity needed to build a robot or machine that can carry out every responsibility a human can. However, it may not take long for our world to be inhabited by fully functioning robots.

Deep Learning is a subfield of Machine Learning. It uses Artificial Neural Networks (ANNs) to train a single-task model (a robot that only does one thing) to come up with solutions to a problem. The idea of computers being able to see and detect features of an object and then classify it as a certain item or face is known as **Computer Vision**. One example of how Deep Learning is transforming Computer Vision is facial recognition. A face enters the neural network (NN) as pixel data from images or video frames, and through a series of layers the model is trained to output the identity of that face. Security systems can compare this output against their database and look up the information about that person if necessary. Computer vision is therefore a very useful tool in practice.

What is a Neural Network?

So far we have been mentioning Neural Networks, but what exactly is an Artificial Neural Network? The idea of ANNs is based on the nervous system: the way the brain uses a series of interconnected neurons to do all sorts of things was the motivation for Artificial Neural Networks. In Deep Learning, the simplest NN consists of a single neuron. A NN can be split into three kinds of layers: the **Input Layer**, the **Hidden Layers**, and the **Output Layer**. The hidden layers are all the layers between the input and output layers, “in which the function applies weights to the inputs and directs them through an activation function as the output. In short, the hidden layers perform nonlinear transformations of the inputs entered into the network” (DeepAI). The output layer can consist of one or multiple neurons. By loosely mimicking how the brain operates, the network learns the underlying relationships in the data in order to estimate the output accurately.
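To make the idea of weights and an activation function concrete, here is a minimal sketch of one pass through a tiny network in plain NumPy. The layer sizes, the weight values, and the choice of ReLU as the activation are assumptions made for illustration only; they are not taken from the model built in this article.

```python
import numpy as np

def relu(x):
    """ReLU activation: keeps positive values, replaces negatives with 0."""
    return np.maximum(0.0, x)

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: input -> hidden layer (weights + ReLU) -> output layer."""
    hidden = relu(x @ w_hidden + b_hidden)  # nonlinear transformation of the inputs
    return hidden @ w_out + b_out           # linear output layer

# Hypothetical values: 2 inputs, 3 hidden neurons, 1 output neuron.
x = np.array([1.0, 2.0])
w_hidden = np.array([[0.5, -1.0, 0.0],
                     [0.25, 0.5, -0.5]])
b_hidden = np.array([0.0, 0.0, 0.5])
w_out = np.array([[1.0], [2.0], [3.0]])
b_out = np.array([0.1])

output = forward(x, w_hidden, b_hidden, w_out, b_out)
print(output)
```

In a real network the weights are not written by hand like this; training adjusts them so that the output matches the expected answer as closely as possible.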

Setting up a Neural Network:

There are a few general steps in creating a neural network, and they form the basis for any problem. These steps are fundamental to understanding how NNs work and how a NN must be tweaked for the problem at hand.

1. Identifying the Problem:

The goal of this article and its code is to feed a picture of a clothing item to our model and have the neural network classify what type of item it is.

2. Importing Necessary Libraries

import tensorflow as tf
import numpy as np
from tensorflow import keras

TensorFlow is a free, open-source platform for Machine Learning. It simplifies building and training Neural Networks to solve complex tasks efficiently.
NumPy is a library for working with arrays and matrices, the building blocks of Linear Algebra. Its role will become clear later on.
Keras is one of the leading high-level neural network APIs and ships with TensorFlow as tensorflow.keras. It is written in Python, and the Keras project supports multiple back-end neural network computation engines.

3. Acquiring Data

The data comes from the Fashion-MNIST dataset, which can be accessed through the Keras API in TensorFlow. Fashion-MNIST is a dataset of Zalando’s article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Zalando intends Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and the same structure of training and testing splits.
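As a rough sketch of how the dataset can be pulled in, the snippet below wraps `keras.datasets.fashion_mnist.load_data()` in a small function. The helper names `load_fashion_mnist` and `summarize` are hypothetical and exist only for this example. TensorFlow is imported inside the loader on purpose, as an assumption of this sketch, so that the summary helper can be tried on any NumPy arrays without downloading anything.

```python
import numpy as np

def load_fashion_mnist():
    """Download (on first call) and return the Fashion-MNIST splits via Keras."""
    # Imported here so that summarize() below can be used without TensorFlow.
    from tensorflow import keras
    return keras.datasets.fashion_mnist.load_data()

def summarize(images, labels):
    """Return basic facts about one split: example count, image shape, class count."""
    return {
        "num_examples": images.shape[0],
        "image_shape": images.shape[1:],
        "num_classes": int(np.unique(labels).size),
    }

# Usage (needs TensorFlow installed and a network connection on the first run):
# (train_images, train_labels), (test_images, test_labels) = load_fashion_mnist()
# print(summarize(train_images, train_labels))
# print(summarize(test_images, test_labels))
```

`load_data()` returns two tuples, one for the training split and one for the test split, each holding an image array and a label array.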

Each image in our data is 28 pixels in height and 28 pixels in width, for a total of 784 pixels. Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255. In the tabular (CSV) form of the dataset, the training and test sets have 785 columns. The first column holds the class label (listed below), which represents the article of clothing. The remaining 784 columns contain the pixel-values of the associated image. When the data is loaded through Keras instead, the images and the labels come back as separate NumPy arrays, which is why the NumPy library helps tremendously when working with these 28x28 arrays.
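The relationship between a 28x28 image and a 785-column row can be sketched with NumPy. The image below is random synthetic data standing in for a real example, and the label value is chosen arbitrarily. Scaling the pixels to the range 0 to 1 is a common preprocessing step that is added here as an assumption, not something the dataset requires.

```python
import numpy as np

# Synthetic stand-in for one grayscale image: integers from 0 to 255.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)
label = 9  # arbitrary class label for this illustration

# Flatten the 28x28 grid into a single vector of 784 pixel-values.
flat = image.reshape(-1)

# A tabular row is the label followed by the 784 pixel-values: 785 columns.
row = np.concatenate(([label], flat))

# Going the other way: drop the label column and restore the 28x28 grid.
restored = row[1:].reshape(28, 28)

# Optional preprocessing (assumption): scale pixel-values to the range [0, 1].
scaled = image.astype(np.float32) / 255.0

print(flat.shape, row.shape, restored.shape)
```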

Labels of our Items

0: T-shirt/top
1: Trouser
2: Pullover
3: Dress
4: Coat
5: Sandal
6: Shirt
7: Sneaker
8: Bag
9: Ankle boot
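Since the labels are plain integers, a small lookup makes predictions easier to read. The sketch below stores the ten names from the list above in order, so the list index doubles as the label. The names `CLASS_NAMES` and `label_to_name` are hypothetical helpers introduced for this example.

```python
# Class names in label order: index 0 is "T-shirt/top", index 9 is "Ankle boot".
CLASS_NAMES = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

def label_to_name(label):
    """Translate an integer class label (0 to 9) into its clothing name."""
    return CLASS_NAMES[int(label)]

print(label_to_name(5))
```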
