Introduction to Multilayer Neural Networks with TensorFlow’s Keras API

Learn how to build and train a multilayer perceptron using TensorFlow’s high-level API, Keras!

The development of Keras started in early 2015. As of today, it has evolved into one of the most popular and widely used libraries built on top of Theano and TensorFlow. One of its prominent features is a very intuitive and user-friendly API, which allows us to implement neural networks in only a few lines of code.

Keras has also been integrated into TensorFlow since version 1.1.0, as part of the contrib module (which contains packages developed by TensorFlow contributors and is considered experimental code).

In this tutorial we will look at this high-level TensorFlow API by walking through:

  • The basics of feedforward neural networks
  • Loading and preparing the popular MNIST dataset
  • Building an image classifier
  • Training a neural network and evaluating its accuracy

Let’s get started!

This tutorial is adapted from Part 4 of Next Tech’s Python Machine Learning series, which takes you through machine learning and deep learning algorithms with Python from 0 to 100. It includes an in-browser sandboxed environment with all the necessary software and libraries pre-installed, and projects using public datasets. You can get started for free here!

Multilayer Perceptrons

Multilayer feedforward neural networks are a special type of fully connected network with multiple layers of neurons. They are also called multilayer perceptrons (MLPs). The following figure illustrates the concept of an MLP consisting of three layers:

The MLP depicted in the preceding figure has one input layer, one hidden layer, and one output layer. The units in the hidden layer are fully connected to the input layer, and the output layer is fully connected to the hidden layer. If such a network has more than one hidden layer, we also call it a deep artificial neural network.

We can add an arbitrary number of hidden layers to the MLP to create deeper network architectures. Practically, we can think of the number of layers and units in a neural network as additional hyperparameters that we want to optimize for a given problem task.

As shown in the preceding figure, we denote the ith activation unit in the lth layer as a_i^(l). To make the math and code implementations a bit more intuitive, we will use the in superscript for the input layer, the h superscript for the hidden layer, and the out superscript for the output layer.

For instance, a_i^(in) refers to the ith value in the input layer, a_i^(h) refers to the ith unit in the hidden layer, and a_i^(out) refers to the ith unit in the output layer. Here, the activation units a_0^(in) and a_0^(h) are the bias units, which we set equal to 1. The activation of the units in the input layer is just the input plus the bias unit:
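
Written out under the notation just introduced (this is a reconstruction; the original equation did not survive extraction), the input-layer activation vector is simply the feature vector with the bias unit prepended:

```latex
a^{(in)} =
\begin{bmatrix} a_0^{(in)} \\ a_1^{(in)} \\ \vdots \\ a_m^{(in)} \end{bmatrix}
=
\begin{bmatrix} 1 \\ x_1 \\ \vdots \\ x_m \end{bmatrix}
```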

Each unit in layer l is connected to all units in layer l + 1 via a weight coefficient. For example, the connection between the kth unit in layer l and the jth unit in layer l + 1 is written as w_{k,j}^(l). Referring back to the previous figure, we denote the weight matrix that connects the input to the hidden layer as W^(h), and we write the matrix that connects the hidden layer to the output layer as W^(out).

We summarize the weights that connect the input and hidden layers by a matrix:

where d is the number of hidden units and m is the number of input units including the bias unit. Since it is important to internalize this notation to follow the concepts later in this tutorial, let’s summarize what we have just learned in a descriptive illustration of a simplified 3–4–3 multilayer perceptron:
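
In symbols, and assuming the dimension convention stated here (rows indexed by input units including the bias, columns by hidden units), this weight matrix can be sketched as:

```latex
W^{(h)} \in \mathbb{R}^{m \times d}, \qquad
W^{(h)} =
\begin{bmatrix}
w_{0,1}^{(h)} & \cdots & w_{0,d}^{(h)} \\
\vdots & \ddots & \vdots \\
w_{m-1,1}^{(h)} & \cdots & w_{m-1,d}^{(h)}
\end{bmatrix}
```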

The MNIST dataset

To see what neural network training via the tensorflow.keras (tf.keras) high-level API looks like, let's implement a multilayer perceptron to classify handwritten digits from the popular Modified National Institute of Standards and Technology (MNIST) dataset, which serves as a popular benchmark for machine learning algorithms.

To follow along with the code snippets in this tutorial, you can use this Next Tech sandbox, which has the MNIST dataset and all necessary packages installed. Otherwise, you can use your local environment and download the dataset here.

The MNIST dataset comes in four parts, as listed here:

  • Training set images: train-images-idx3-ubyte.gz — 60,000 samples
  • Training set labels: train-labels-idx1-ubyte.gz — 60,000 labels
  • Test set images: t10k-images-idx3-ubyte.gz — 10,000 samples
  • Test set labels: t10k-labels-idx1-ubyte.gz — 10,000 labels

The training set consists of handwritten digits from 250 different people (50% high school students, 50% employees from the Census Bureau). The test set contains handwritten digits from different people.

Note that TensorFlow also provides the same dataset as follows:

	import tensorflow as tf
	from tensorflow.examples.tutorials.mnist import input_data

However, we will work with the MNIST dataset as an external dataset to learn all the steps of data preprocessing separately. This way, you learn what you need to do with your own dataset.

The first step is to unzip the four parts of the MNIST dataset by running the following commands in your Terminal:

cd mnist/
gzip *ubyte.gz -d

import os
import struct
import numpy as np


def load_mnist(path, kind='train'):
    """Load MNIST data from `path`"""
    labels_path = os.path.join(
        path, f'{kind}-labels-idx1-ubyte'
    )
    images_path = os.path.join(
        path, f'{kind}-images-idx3-ubyte'
    )

    with open(labels_path, 'rb') as lbpath:
        magic, n = struct.unpack('>II', lbpath.read(8))
        labels = np.fromfile(lbpath, dtype=np.uint8)

    with open(images_path, 'rb') as imgpath:
        magic, num, rows, cols = struct.unpack(">IIII", imgpath.read(16))
        images = np.fromfile(imgpath, dtype=np.uint8).reshape(len(labels), 784)
        images = ((images / 255.) - .5) * 2

    return images, labels

The load_mnist function returns two arrays, the first being an n x m dimensional NumPy array (images), where n is the number of samples and m is the number of features (here, pixels). The images in the MNIST dataset consist of 28 x 28 pixels, and each pixel is represented by a grayscale intensity value. Here, we unroll the 28 x 28 pixels into one-dimensional row vectors, which represent the rows in our images array (784 per row or image). The second array (labels) returned by the load_mnist function contains the corresponding target variable, the class labels (integers 0-9) of the handwritten digits.
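
The pixel rescaling applied inside load_mnist maps grayscale values from [0, 255] onto [-1, 1]. A minimal sketch of that transformation on a few sample values (helper name is illustrative, not part of the tutorial code):

```python
def rescale(pixel):
    """Map a grayscale value in [0, 255] to [-1, 1]."""
    return ((pixel / 255.) - .5) * 2

# black, mid-gray, white
print([rescale(p) for p in (0, 127.5, 255)])  # [-1.0, 0.0, 1.0]
```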

Then, the dataset is loaded and prepared as follows:

# loading the data
X_train, y_train = load_mnist('./mnist/', kind='train')
print(f'Rows: {X_train.shape[0]},  Columns: {X_train.shape[1]}')

X_test, y_test = load_mnist('./mnist/', kind='t10k')
print(f'Rows: {X_test.shape[0]},  Columns: {X_test.shape[1]}')

# mean centering and normalization:
mean_vals = np.mean(X_train, axis=0)
std_val = np.std(X_train)

X_train_centered = (X_train - mean_vals)/std_val
X_test_centered = (X_test - mean_vals)/std_val

del X_train, X_test

print(X_train_centered.shape, y_train.shape)
print(X_test_centered.shape, y_test.shape)

[Out:]
 Rows: 60000,  Columns: 784
 Rows: 10000,  Columns: 784
 (60000, 784) (60000,)
 (10000, 784) (10000,)

To get an idea of how the images in MNIST look, let’s visualize examples of the digits 0–9 via Matplotlib’s imshow function:

import matplotlib.pyplot as plt
	

	fig, ax = plt.subplots(nrows=2, ncols=5,
	                       sharex=True, sharey=True)
	ax = ax.flatten()
	for i in range(10):
	    img = X_train_centered[y_train == i][0].reshape(28, 28)
	    ax[i].imshow(img, cmap='Greys')
	

	ax[0].set_yticks([])
	ax[0].set_xticks([])
	plt.tight_layout()
	plt.show()

We should now see a plot of the 2 x 5 subfigures showing a representative image of each unique digit:

Now let’s start building our model!

Building an MLP using TensorFlow’s Keras API

First, let’s set the random seed for NumPy and TensorFlow so that we get consistent results:

import numpy as np
import tensorflow as tf
import tensorflow.contrib.keras as keras

np.random.seed(123)
tf.set_random_seed(123)

To continue with the preparation of the training data, we need to convert the class labels (integers 0–9) into the one-hot format. Fortunately, Keras provides a convenient tool for this:

y_train_onehot = keras.utils.to_categorical(y_train)
	
print('First 3 labels: ', y_train[:3])
print('\nFirst 3 labels (one-hot):\n', y_train_onehot[:3])

First 3 labels:  [5 0 4]
First 3 labels (one-hot):
 [[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]]
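
What keras.utils.to_categorical does here is straightforward. A minimal pure-Python equivalent, purely for illustration (the function name is hypothetical, and the real Keras utility also handles NumPy arrays and infers the number of classes):

```python
def to_onehot(labels, num_classes=10):
    """Convert integer class labels to one-hot row vectors."""
    onehot = []
    for label in labels:
        row = [0.] * num_classes
        row[label] = 1.  # set the position of the true class to 1
        onehot.append(row)
    return onehot

print(to_onehot([5, 0, 4])[0])  # [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
```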

Now, let’s implement our neural network! Briefly, we will have three layers, where the first two layers (the input and hidden layers) each have 50 units with the tanh activation function, and the last layer (the output layer) has 10 units for the 10 class labels and uses softmax to give the probability of each class. Keras makes these tasks very simple:

# initialize model
	model = keras.models.Sequential()
	

	# add input layer
	model.add(keras.layers.Dense(
	    units=50,
	    input_dim=X_train_centered.shape[1],
	    kernel_initializer='glorot_uniform',
	    bias_initializer='zeros',
	    activation='tanh') 
	)
	# add hidden layer
	model.add(
	    keras.layers.Dense(
	        units=50,
	        input_dim=50,
	        kernel_initializer='glorot_uniform',
	        bias_initializer='zeros',
	        activation='tanh')
	    )
	# add output layer
	model.add(
	    keras.layers.Dense(
	        units=y_train_onehot.shape[1],
	        input_dim=50,
	        kernel_initializer='glorot_uniform',
	        bias_initializer='zeros',
	        activation='softmax')
	    )
	

	# define SGD optimizer
	sgd_optimizer = keras.optimizers.SGD(
	    lr=0.001, decay=1e-7, momentum=0.9
	)
	# compile model
	model.compile(
	    optimizer=sgd_optimizer,
	    loss='categorical_crossentropy'
	)

First, we initialize a new model using the Sequential class to implement a feedforward neural network. Then, we can add as many layers to it as we like. However, since the first layer that we add is the input layer, we have to make sure that the input_dim attribute matches the number of features (columns) in the training set (784 features or pixels in the neural network implementation).

Also, we have to make sure that the number of output units (units) and input units (input_dim) of two consecutive layers match. Our first two layers have 50 units plus one bias unit each. The number of units in the output layer should be equal to the number of unique class labels — the number of columns in the one-hot-encoded class label array.

Note that we used glorot_uniform as the initialization algorithm for the weight matrices. Glorot initialization is a more robust way of initializing deep neural networks. The biases are initialized to zero, which is more common and is, in fact, the default setting in Keras.

Before we can compile our model, we also have to define an optimizer. We chose stochastic gradient descent optimization. Furthermore, we can set values for the weight decay constant and the momentum to adjust the learning rate at each epoch. Lastly, we set the cost (or loss) function to categorical_crossentropy.
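
The update rule behind this optimizer can be sketched in a few lines of plain Python. This is a simplified, scalar version for intuition only (the function name is hypothetical, and Keras's actual implementation additionally applies the decay to the learning rate over iterations):

```python
def sgd_momentum_step(w, grad, velocity, lr=0.001, momentum=0.9):
    """One SGD-with-momentum update for a single weight."""
    velocity = momentum * velocity - lr * grad  # decaying history of past gradients
    w = w + velocity                            # move the weight along the velocity
    return w, velocity

w, v = 1.0, 0.0
w, v = sgd_momentum_step(w, grad=0.5, velocity=v)
print(w, v)
```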

The binary cross-entropy is just a technical term for the cost function in the logistic regression, and the categorical cross-entropy is its generalization for multiclass predictions via softmax.
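
Concretely, categorical cross-entropy compares the one-hot target with the predicted softmax probabilities. A small pure-Python illustration with hypothetical probability values (only the term for the true class contributes to the sum):

```python
from math import log

def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * log(p) for t, p in zip(y_true, y_pred) if t > 0)

y_true = [0., 0., 1.]     # true class is index 2
y_pred = [0.1, 0.2, 0.7]  # softmax output
print(round(categorical_crossentropy(y_true, y_pred), 4))  # 0.3567, i.e. -ln(0.7)
```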

After compiling the model, we can now train it by calling the fit method. Here, we are using mini-batch stochastic gradient descent with a batch size of 64 training samples per batch. We train the MLP over 50 epochs, and we can follow the optimization of the cost function during training by setting verbose=1.

The validation_split parameter is especially handy since it will reserve 10% of the training data (here, 6,000 samples) for validation after each epoch so that we can monitor whether the model is overfitting during training:

# train model
	history = model.fit(
	    X_train_centered, y_train_onehot,
	    batch_size=64, epochs=50,
	    verbose=1, validation_split=0.1
	)

Printing the value of the cost function during training is extremely useful, since we can quickly spot whether the cost is decreasing and, if it is not, stop the algorithm early and tune the hyperparameter values.

To predict the class labels, we can then use the predict_classes method to return the class labels directly as integers:

y_train_pred = model.predict_classes(X_train_centered, verbose=0)
print('First 3 predictions: ', y_train_pred[:3])

First 3 predictions: [5 0 4]

Finally, let’s print the model accuracy on training and test sets:

# calculate training accuracy
	y_train_pred = model.predict_classes(X_train_centered, verbose=0)
	correct_preds = np.sum(y_train == y_train_pred, axis=0)
	train_acc = correct_preds / y_train.shape[0]
	

	print(f'Training accuracy: {(train_acc * 100):.2f}')
	

	# calculate testing accuracy
	y_test_pred = model.predict_classes(X_test_centered, verbose=0)
	correct_preds = np.sum(y_test == y_test_pred, axis=0)
	test_acc = correct_preds / y_test.shape[0]
	

	print(f'Test accuracy: {(test_acc * 100):.2f}')

Training accuracy: 98.81
Test accuracy: 96.27

I hope you enjoyed this tutorial on using TensorFlow's Keras API to build and train a multilayer neural network for image classification! Note that this is just a very simple neural network without optimized tuning parameters.

In practice you need to know how to optimize the model by tweaking learning rate, momentum, weight decay, and number of hidden units. You also need to learn how to deal with the vanishing gradient problem, wherein error gradients become increasingly small as more layers are added to a network.
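
The vanishing gradient problem can be illustrated with the tanh activation used above: its derivative is at most 1 and shrinks quickly away from zero, so the product of many such derivatives through a deep stack tends toward zero. A toy sketch with a hypothetical activation value:

```python
from math import tanh

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1 - tanh(x) ** 2

# product of per-layer gradients through increasingly deep stacks of tanh layers
grad = tanh_grad(1.0)  # roughly 0.42 for a single layer
for depth in (1, 5, 10, 20):
    print(depth, grad ** depth)
```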

We cover these topics in Next Tech’s Python Machine Learning (Part 4) course, as well as:

  • Breaking down the mechanics of TensorFlow, such as tensors, activation functions, computation graphs, variables, and placeholders
  • Low-level TensorFlow and another high-level API, Layers
  • Modeling sequential data using recurrent neural networks (RNNs) and long short-term memory (LSTM) networks
  • Classifying images with deep convolutional neural networks (CNNs)

You can get started here for free!

Top Machine Learning Framework: 5 Machine Learning Frameworks of 2019

Machine Learning (ML) is one of the fastest-growing technologies today. ML has a lot of frameworks to build a successful app, and so as a developer, you might be getting confused about choosing the right framework. Herein we have curated the top 5 machine learning frameworks that put cutting-edge technology in your hands.

Thanks to machine learning frameworks, mobile phones and tablets are now powerful enough to run software that can learn and react in real time. Machine learning is a complex discipline, but implementing ML models is far less daunting and difficult than it used to be. Models now automatically improve their performance over time through interactions, experiences, and, most importantly, the acquisition of useful data pertaining to their assigned tasks.

As we know, ML is considered a subset of Artificial Intelligence (AI): the scientific study of statistical models and algorithms that help a computing system accomplish designated tasks efficiently. Now, as a mobile app developer planning to choose a machine learning framework, you must keep the following things in mind.

  • The framework should be performance-oriented
  • Grasping and coding should be quick
  • The framework must support parallelization so the computational process can be distributed
  • It should provide developer-friendly tools for creating models

Let’s learn about the top five machine learning frameworks to make the right choice for your next ML application development project. Before we dive deeper into them, here are the different types of ML frameworks available on the web:

  • Mathematically oriented
  • Neural network-based
  • Linear algebra tools
  • Statistical tools

Now, let’s have an insight into the ML frameworks that will help you select the right one for your ML application.

Don’t Miss Out on These 5 Machine Learning Frameworks of 2019
#1 TensorFlow
TensorFlow is an open-source software library for dataflow programming across multiple tasks. The framework is based on computational graphs, which are essentially networks of code in which each node represents a mathematical operation that runs some function, as simple or as complex as multivariate analysis. This framework is said to be the best among all the ML libraries, as it supports complicated tasks and algorithms such as regression, classification, and neural networks.

Learning the TensorFlow Python framework demands additional effort. Once you have grasped Python's frameworks and libraries, though, your job with the framework's n-dimensional arrays becomes easy.

The main benefit of this framework is flexibility. TensorFlow allows non-automatic migration to newer versions. It runs on GPUs, CPUs, servers, desktops, and mobile devices, and provides auto-differentiation and good performance. A few goliaths like Airbus, Twitter, and IBM have innovatively used TensorFlow.

#2 Firebase ML Kit
Firebase ML Kit is a machine learning library that allows effortless, minimal-code use of highly accurate, pre-trained deep models. We at Space-O Technologies use this machine learning technology for image classification and object detection. The Firebase framework offers models both locally and on the Google Cloud.

This is one of our ML tutorials to help you understand the Firebase framework. First of all, we collected photos of an empty glass, a half-filled glass, and a full glass, and fed them into the machine learning algorithms. This helped the machine search and analyze according to the nature, behavior, and patterns of the object placed in front of it.

  • The first photo we targeted through the machine learning algorithms was an empty glass. So that the app could do its analysis and search for the correct answer, we provided it with a number of empty-glass images prior to the experiment.
  • The next photo we targeted was a half-filled glass. The core of a machine learning app is to assemble data and manage it through analysis. The app was able to recognize the image accurately because of the glass images given to it beforehand.
  • The last one was a full-glass recognition image.

Note: For correct recognition, each label has to carry at least 100 images of a particular object.

#3 CAFFE (Convolutional Architecture for Fast Feature Embedding)
The CAFFE framework is the fastest way to apply deep neural networks. It is the machine learning framework best known for its Model Zoo, a set of pre-trained ML models capable of performing a great variety of tasks. Image classification, machine vision, and recommender systems are some of the tasks performed easily through this ML library.

This framework is majorly written in C++. It can run on multiple hardware platforms and can switch between CPU and GPU with a single flag. It has well-organized MATLAB and Python interfaces.

Now, if you are planning machine learning app development, note that CAFFE is mainly used in academic research projects and for designing startup prototypes. It is the aptest machine learning technology for research experiments and industry deployment. This framework can process 60 million images per day on a single Nvidia K40 GPU.

#4 Apache Spark
Apache Spark is a cluster-computing framework with APIs in different languages like Java, Scala, R, and Python. Spark's machine learning library, MLlib, is considered foundational to Spark's success. Building MLlib on top of Spark makes it possible to tackle distinct needs with a single tool instead of many disjointed ones.

The advantages of such an ML library include lower learning curves and less complex development and production environments, which ultimately result in a shorter time to deliver high-performing models. A key benefit of MLlib is that it allows data scientists to solve multiple data problems in addition to their machine learning problems.

It can easily handle graph computations (via GraphX), streaming (real-time calculations), and real-time interactive query processing with Spark SQL and DataFrames. Data professionals can focus on solving data problems instead of learning and maintaining a different tool for each scenario.

#5 Scikit-Learn
Scikit-learn is said to be one of the greatest feats of the Python community. This machine learning framework efficiently handles data mining and supports multiple practical tasks. It is built on foundations like SciPy, NumPy, and Matplotlib. The framework is known for its supervised and unsupervised learning algorithms as well as cross-validation. Scikit-learn is largely written in Python, with some core algorithms written in Cython to achieve performance.

The machine learning framework can work on multiple tasks without compromising on speed. There are some remarkable apps using this framework, such as Spotify, Evernote, AWeber, and Inria.

With the help of machine learning frameworks, building ML-powered iOS and Android apps has become quite an easy process. With this emerging technology trend, varieties of data are available, computational processing has become cheaper and more powerful, and data storage is affordable. So if you are an app developer, or have an idea for a machine learning app, you should definitely dive into this niche.

Conclusion
Still have any queries or confusion regarding ML frameworks, the machine learning app development guide, the difference between Artificial Intelligence and machine learning, ML algorithms from scratch, or how this technology can help your business? Just fill in our contact form. Our sales representatives will get back to you shortly and resolve your queries. The consultation is absolutely free of cost.

Author Bio: This blog was written with the help of Jigar Mistry, who has over 13 years of experience in the web and mobile app development industry. He has guided the development of over 200 mobile apps and has special expertise in different mobile app categories, like Uber-like apps, health and fitness apps, on-demand apps, and machine learning apps. So, we took his help to write this complete guide on machine learning technology and machine learning app development.

Introduction to Machine Learning with TensorFlow.js

Learn how to build and train Neural Networks using the most popular Machine Learning framework for JavaScript, TensorFlow.js.

This is a practical workshop where you'll learn "hands-on" by building several different applications from scratch using TensorFlow.js.

If you have ever been interested in Machine Learning, if you want to get a taste for what this exciting field has to offer, if you want to be able to talk to other Machine Learning/AI specialists in a language they understand, then this workshop is for you.

Thanks for reading


Further reading about Machine Learning and TensorFlow.js

Machine Learning A-Z™: Hands-On Python & R In Data Science

Machine Learning In Node.js With TensorFlow.js

Machine Learning in JavaScript with TensorFlow.js

A Complete Machine Learning Project Walk-Through in Python

Top 10 Machine Learning Algorithms You Should Know to Become a Data Scientist

TensorFlow Vs PyTorch: Comparison of the Machine Learning Libraries

Libraries play an important role when developers decide to work in Machine Learning or Deep Learning researches. In this article, we list down 10 comparisons between TensorFlow and PyTorch these two Machine Learning Libraries.

According to this article, a survey based on a sample of 1,616 ML developers and data scientists found that for every one developer using PyTorch, there are 3.4 developers using TensorFlow.

1 - Origin

PyTorch has been developed by Facebook and is based on Torch, while TensorFlow, an open-sourced Machine Learning library developed by Google Brain, is based on the idea of dataflow graphs for building models.

2 - Features

TensorFlow has some attractive features, such as TensorBoard, which serves as a great option for visualising a Machine Learning model, and TensorFlow Serving, a dedicated gRPC server used for deploying models in production. On the other hand, PyTorch has several distinguishing features too, such as dynamic computation graphs, native support for Python, and support for CUDA, which ensures less time running code and an increase in performance.

3 - Community

TensorFlow is adopted by many researchers in various fields, like academia and business organisations. It has a much bigger community than PyTorch, which means it is easier to find resources and solutions. There is a vast amount of tutorials, code, and support for TensorFlow, while PyTorch, being the newcomer compared to TensorFlow, lacks these benefits.

4 - Visualisation

Visualisation plays a protagonist's role when presenting any project in an organisation. TensorFlow has TensorBoard for visualising Machine Learning models, which helps during training to spot errors quickly. It is a real-time representation of a model's graphs that not only depicts the graphical structure but also shows accuracy graphs in real time. PyTorch lacks this eye-catching feature.

5 - Defining Computational Graphs

In TensorFlow, defining a computational graph is a lengthy process, as you have to build and run the computations within sessions. You also have to use other constructs such as placeholders, variable scoping, etc. On the other hand, PyTorch wins this point, as its dynamic computation graphs help in building graphs dynamically: the graph is built at every point of execution, and you can manipulate it at run time.

6 - Debugging

Because PyTorch uses dynamic computation, debugging is a painless process. You can easily use Python debugging tools like pdb or ipdb; for instance, you can put pdb.set_trace() at any line of code, then proceed to execute further computations, pinpoint the cause of errors, etc. For TensorFlow, meanwhile, you have to use the TensorFlow debugger tool, tfdbg, which lets you view the internal structure and state of running TensorFlow graphs during training and inference.

7 - Deployment

For now, deployment in TensorFlow is much better supported than in PyTorch. It has the advantage of TensorFlow Serving, a flexible, high-performance serving system for deploying Machine Learning models, designed for production environments. In PyTorch, however, you can use Flask, the microframework for Python, for deploying models.

8 - Documentation

The documentation of both frameworks is broadly available, as there are examples and tutorials in abundance for both libraries. You could say it is a tie between the two frameworks.

Click here for TensorFlow documentation and click here for PyTorch documentation.

9 - Serialisation

Serialisation in TensorFlow can be counted as one of the framework's advantages: you can save your entire graph as a protocol buffer and later load it in other supported languages. PyTorch lacks this feature.

10 - Device Management

By default, TensorFlow maps nearly all of the GPU memory on all GPUs visible to the process, which is a drawback, but it automatically presumes that you want to run your code on the GPU thanks to its well-set defaults, resulting in fair device management. On the other hand, PyTorch keeps track of the currently selected GPU, and all CUDA tensors you allocate will be created on that device.