TensorFlow.js Crash Course — Machine Learning For The Web — Handwriting Recognition
In the first part, TensorFlow.js Crash Course – Machine Learning For The Web – Getting Started, we covered the basic building blocks of the TensorFlow.js library.
In this part we’re going one step further and will explore another use case: the recognition of handwritten digits. It is therefore assumed that you’re familiar with the basic building blocks of TensorFlow.js which have been introduced in the first episode.
Let’s take a look at the application which we’re going to build in this tutorial. The application will use the MNIST data set to train a neural network. The model is built and trained when the website is loaded. The progress can be seen in the Log Output area:
Once the training procedure is completed, the user is informed with the message “Training complete” and the button in the *Predict* area is activated.
Pressing the button randomly selects one data set from the MNIST data source to perform a prediction with the trained model. The output looks like the following:
The image of the handwritten digit is presented, and the original value and the predicted value are output. If the prediction is correct, the text “Value recognized successfully” is visible as well. This shows that the trained neural network was able to recognize the digit in the image correctly.
The user is able to use the button multiple times. The output is extended as you can see in the following screenshot:
MNIST is a data set which contains images of handwritten digits from 0 to 9. We’ll use that database of images to train the model of our application. Furthermore we’ll make use of randomly selected images from the MNIST data set to test whether the neural network is able to perform correct predictions.
Again let’s start with setting up the project by creating a new folder:
$ mkdir tfjs02
Change into that newly created project folder:
$ cd tfjs02
Inside the folder we’re now ready to create a package.json file, so that we’re able to manage dependencies by using the Node.js Package Manager:
$ npm init -y
Because we’ll be installing the dependencies (e.g. the TensorFlow.js library) locally in our project folder, we need to use a module bundler for our web application. To keep things as easy as possible we’re going to use the Parcel web application bundler, because Parcel works with zero configuration. Let’s install the Parcel bundler by executing the following command in the project directory:
$ npm install -g parcel-bundler
Next, let’s create two new empty files for our implementation:
$ touch index.html index.js
Finally let’s add the Bootstrap library as a dependency because we will be using some Bootstrap CSS classes for our user interface elements:
$ npm install bootstrap
Now let’s add further dependencies to the project to make sure that we’re able to use the latest ECMAScript features like async/await:
$ npm install --save-dev babel-plugin-transform-runtime babel-runtime
Create a .babelrc file and add:
{
  "plugins": [
    ["transform-runtime", {
      "polyfill": false,
      "regenerator": true
    }]
  ]
}
Last but not least, we must not forget to install TensorFlow.js as well:
$ npm install @tensorflow/tfjs
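With the dependencies in place, you can already serve the project. Once the code from the following sections has been added, one way to run the application with Parcel (using its default development server, which listens on port 1234) is:
$ parcel index.html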
Before we start building the convolutional neural network model, we define a variable *model* which will hold the model, and a function *createModel* which will contain the code needed to create and compile the machine learning model:
var model;

function createModel() {
  // Insert the following pieces of code here
}
Let’s first create the sequential model instance, as already covered in episode 1 of this series, and insert the following code into the function createModel:
createLogEntry('Create model ...');
model = tf.sequential();
createLogEntry('Model created');
Additionally we’re making use of a function named createLogEntry. This function will be implemented later on and is used to output text messages to the Log Output area.
First, let’s add a two-dimensional convolutional layer by using the following code:
createLogEntry('Add layers ...');
model.add(tf.layers.conv2d({
  inputShape: [28, 28, 1],
  kernelSize: 5,
  filters: 8,
  strides: 1,
  activation: 'relu',
  kernelInitializer: 'VarianceScaling'
}));
The layer is created via *tf.layers.conv2d*. To configure the layer, a configuration object is passed as a parameter to this method. The new layer is added to the model by passing it into the call of model.add.
The configuration object which is passed to *conv2d* contains six configuration properties in total:
- inputShape: the shape of the data flowing into the first layer of the model; the MNIST images are 28×28 pixel grayscale images, so the input shape is [28, 28, 1].
- kernelSize: the size of the sliding convolutional filter window; a value of 5 specifies a square window of 5×5 pixels.
- filters: the number of filter windows of size kernelSize to apply to the input data; here we use 8 filters.
- strides: the step size of the sliding window, i.e. how many pixels the filter shifts each time it moves over the image; we use a stride of 1.
- activation: the activation function applied to the data after the convolution has completed; here we use the Rectified Linear Unit (ReLU) function.
- kernelInitializer: the method used for randomly initializing the model weights; we use VarianceScaling.
The next layer we’re going to add to our neural network model is a two-dimensional max pooling layer. We’re using that layer to down-sample the image so that it is half the size of the input from the previous layer. The max pooling layer is defined in the following way:
model.add(tf.layers.maxPooling2d({
  poolSize: [2, 2],
  strides: [2, 2]
}));
The layer is configured by passing in a configuration object with two configuration properties:
- poolSize: the size of the sliding pooling window applied to the input data; here we use a 2×2 window.
- strides: the step size of the sliding pooling window; here the window moves 2 pixels both horizontally and vertically.
Since both values are set to [2, 2], the pooling windows are completely non-overlapping. As a result, this layer cuts the size of the output of the previous layer in half: the 24×24 output of the first convolutional layer is reduced to 12×12.
A common pattern in convolutional neural network models used for image recognition is to repeat the combination of a convolutional layer followed by a max pooling layer. So let’s add another two-dimensional convolutional layer as the third layer of our model:
model.add(tf.layers.conv2d({
  kernelSize: 5,
  filters: 16,
  strides: 1,
  activation: 'relu',
  kernelInitializer: 'VarianceScaling'
}));
This time we do not need to define the input shape because it is determined automatically from the output shape of the previous layer.
The fourth layer is again a max pooling layer to further down-sample the result:
model.add(tf.layers.maxPooling2d({
  poolSize: [2, 2],
  strides: [2, 2]
}));
Having repeated the combination of a convolutional layer and a max pooling layer a second time, we can now add a flatten layer as the fifth layer of our model:
model.add(tf.layers.flatten());
This layer will flatten the output from the previous layer to a vector.
The final layer which is added to our model is a dense layer (fully connected layer). This layer performs the final classification:
model.add(tf.layers.dense({
  units: 10,
  kernelInitializer: 'VarianceScaling',
  activation: 'softmax'
}));
createLogEntry('Layers created');
The dense layer configuration consists of the following properties:
- units: the number of output units; because we’re classifying the digits 0 to 9, we need 10 output units.
- kernelInitializer: again, VarianceScaling is used to randomly initialize the weights.
- activation: the softmax activation function, which turns the output into a probability distribution over the ten possible digit classes.
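If you want to verify the down-sampling steps described above, a small optional check (a sketch, not part of the original tutorial) can print each layer's output shape once createModel has added all layers:
// Optional sanity check: print every layer's output shape (run after createModel()).
model.layers.forEach((layer) => {
  console.log(JSON.stringify(layer.outputShape));
});
// Expected output shapes, in order (null is the batch dimension):
// [null, 24, 24, 8]   first conv2d:        28 - 5 + 1 = 24
// [null, 12, 12, 8]   first maxPooling2d:  24 / 2 = 12
// [null, 8, 8, 16]    second conv2d:       12 - 5 + 1 = 8
// [null, 4, 4, 16]    second maxPooling2d: 8 / 2 = 4
// [null, 256]         flatten:             4 * 4 * 16 = 256
// [null, 10]          dense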
All needed layers have now been added to the model. Before we train the model with the MNIST data we need to make sure that the model is compiled:
createLogEntry('Start compiling ...');
model.compile({
  optimizer: tf.train.sgd(0.15),
  loss: 'categoricalCrossentropy'
});
createLogEntry('Compiled');
The object which is passed to the call of model.compile contains two properties:
- optimizer: the optimizer which drives the training; here we’re using the stochastic gradient descent (SGD) optimizer with a learning rate of 0.15.
- loss: the loss function which is minimized during training; categorical cross-entropy is a common choice for multi-class classification problems.
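As a side note, model.compile also accepts an optional metrics property. A variant of the call above that additionally tracks the accuracy during training could look like the following sketch (it is not used in the remainder of this tutorial):
model.compile({
  optimizer: tf.train.sgd(0.15),
  loss: 'categoricalCrossentropy',
  // Optional: report the accuracy metric in addition to the loss
  metrics: ['accuracy']
});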
Let’s start training the model with MNIST data sets of handwritten digits. To access the MNIST data from a remote server we’re using the MnistData class from the project https://github.com/tensorflow/tfjs-examples/tree/master/mnist. To make that class available, just download the file data.js from that repository and place it in our project directory. In index.js, use the following import statement to make the MnistData class available:
import {MnistData} from './data';
The data should be kept in a variable named data. A load function is added to our application to load the data by calling the *MnistData* method load:
let data;
async function load() {
  createLogEntry('Loading MNIST data ...');
  data = new MnistData();
  await data.load();
  createLogEntry('Data loaded successfully');
}
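For orientation: the MnistData class returns batches as plain objects with an xs tensor and a labels tensor. A quick way to inspect this after load() has finished could look like this (a sketch, not part of the original tutorial):
// Inspect one training sample (assumes data.load() has already completed).
const sample = data.nextTrainBatch(1);
console.log(sample.xs.shape);     // [1, 784]  one flattened 28x28 image
console.log(sample.labels.shape); // [1, 10]   one-hot encoded digit label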
With the MNIST data records available we’re now ready to prepare for training. Let’s first define two constants:
const BATCH_SIZE = 64;
const TRAIN_BATCHES = 150;
The training will not be performed in one operation. Instead we’ll perform the training in batches of data. The size of each batch and the number of batches to be trained are defined by those constants. The training logic is encapsulated in the function train:
async function train() {
  createLogEntry('Start training ...');
  for (let i = 0; i < TRAIN_BATCHES; i++) {
    // tf.tidy disposes of the intermediate tensors created inside the callback
    const batch = tf.tidy(() => {
      const batch = data.nextTrainBatch(BATCH_SIZE);
      // Reshape the flattened images to the [28, 28, 1] format the conv2d layer expects
      batch.xs = batch.xs.reshape([BATCH_SIZE, 28, 28, 1]);
      return batch;
    });
    await model.fit(
      batch.xs, batch.labels, {batchSize: BATCH_SIZE, epochs: 1}
    );
    tf.dispose(batch);
    // Yield control back to the browser so the page stays responsive during training
    await tf.nextFrame();
  }
  createLogEntry('Training complete');
}
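If you would like to watch the loss develop during training, model.fit returns a History object. The following sketch shows how the fit call inside the loop could be extended to log the loss of each batch to the Log Output area:
// Possible variation of the fit call inside the training loop (sketch only):
// the returned History object exposes the loss values under history.loss.
const history = await model.fit(
  batch.xs, batch.labels, {batchSize: BATCH_SIZE, epochs: 1}
);
createLogEntry('Batch ' + i + ' loss: ' + history.history.loss[0]);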
In the next step let’s add the HTML / CSS code which is needed to implement the user interface of our application in index.html:
<html>
  <body>
    <style>
      .prediction-canvas {
        width: 100px;
        margin: 20px;
      }
      .prediction-div {
        display: inline-block;
        margin: 10px;
      }
    </style>
    <div class="container" style="padding-top: 20px">
      <div class="card">
        <div class="card-header">
          <strong>TensorFlow.js Demo - Handwriting Recognition</strong>
        </div>
        <div class="card-body">
          <div class="card">
            <div class="card-body">
              <h5 class="card-title">Log Output:</h5>
              <div id="log"></div>
            </div>
          </div>
          <br>
          <div class="card">
            <div class="card-body">
              <h5 class="card-title">Predict</h5>
              <button type="button" class="btn btn-primary" id="selectTestDataButton" disabled>Please wait until model is ready ...</button>
              <div id="predictionResult"></div>
            </div>
          </div>
        </div>
      </div>
    </div>
    <script src="./index.js"></script>
  </body>
</html>
Here we’re making use of various Bootstrap CSS classes.
For the output which is written to the log output area, a div element with the ID log is available.
The output area for the prediction result is the div element with the ID predictionResult.
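For the Bootstrap classes used above to take effect, the Bootstrap stylesheet needs to be part of the Parcel bundle. The original snippets don't show this step; one way to do it (an assumption) is to import the CSS file at the top of index.js:
// Assumption: pull the Bootstrap stylesheet into the Parcel bundle.
import 'bootstrap/dist/css/bootstrap.css';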
The createLogEntry function has already been used several times to output messages in the log area. Now let’s add the missing implementation of that function in index.js as well:
function createLogEntry(entry) {
  document.getElementById('log').innerHTML += '<br>' + entry;
}
Finally, let’s bring everything together and implement the function main, which calls createModel, load, and train:
async function main() {
  createModel();
  await load();
  await train();
  document.getElementById('selectTestDataButton').disabled = false;
  document.getElementById('selectTestDataButton').innerText = "Randomly Select Test Data And Predict";
}
main();
Furthermore we’re making sure that the button is enabled and its caption is updated once the training procedure has completed successfully.
Let’s move on to the final task and add the code which is needed to perform the prediction based on our trained convolutional neural network. For this purpose we’re adding the *predict* function in the following way:
async function predict(batch) {
  tf.tidy(() => {
    // Extract the actual digit value from the one-hot encoded label
    const input_value = Array.from(batch.labels.argMax(1).dataSync());
    const div = document.createElement('div');
    div.className = 'prediction-div';
    // Run the prediction and extract the index of the highest output score
    const output = model.predict(batch.xs.reshape([-1, 28, 28, 1]));
    const prediction_value = Array.from(output.argMax(1).dataSync());
    const image = batch.xs.slice([0, 0], [1, batch.xs.shape[1]]);
    // Render the handwritten digit onto a canvas element
    const canvas = document.createElement('canvas');
    canvas.className = 'prediction-canvas';
    draw(image.flatten(), canvas);
    const label = document.createElement('div');
    label.innerHTML = 'Original Value: ' + input_value;
    label.innerHTML += '<br>Prediction Value: ' + prediction_value;
    console.log(prediction_value + '-' + input_value);
    if (prediction_value - input_value == 0) {
      label.innerHTML += '<br>Value recognized successfully!';
    } else {
      label.innerHTML += '<br>Recognition failed!';
    }
    div.appendChild(canvas);
    div.appendChild(label);
    document.getElementById('predictionResult').appendChild(div);
  });
}
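The combination of argMax(1) and dataSync() used above turns the model output into a plain JavaScript number. A small standalone illustration (not part of the application code) shows the idea:
// Standalone illustration of argMax + dataSync (not part of the app):
// the index of the highest score in the output vector is the predicted digit.
const scores = tf.tensor2d([[0.01, 0.02, 0.80, 0.01, 0.01, 0.05, 0.04, 0.02, 0.02, 0.02]]);
console.log(scores.argMax(1).dataSync()[0]); // 2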
Part of the output is the image of the handwritten digit. To draw the image we’re making use of a custom draw function. The implementation of that function needs to be added to index.js as well:
function draw(image, canvas) {
  const [width, height] = [28, 28];
  canvas.width = width;
  canvas.height = height;
  const ctx = canvas.getContext('2d');
  const imageData = new ImageData(width, height);
  const data = image.dataSync();
  // Convert the normalized pixel values (0..1) into grayscale RGBA values
  for (let i = 0; i < height * width; ++i) {
    const j = i * 4;
    imageData.data[j + 0] = data[i] * 255;
    imageData.data[j + 1] = data[i] * 255;
    imageData.data[j + 2] = data[i] * 255;
    imageData.data[j + 3] = 255;
  }
  ctx.putImageData(imageData, 0, 0);
}
Finally we need to add the click event handler function for the selectTestDataButton:
document.getElementById('selectTestDataButton').addEventListener('click', async (el, ev) => {
  const batch = data.nextTestBatch(1);
  await predict(batch);
});
Inside this function we’re using the nextTestBatch method from the MnistData class to retrieve a batch of test data of size 1 (which means that only one data set is included). Next we’re calling the asynchronous predict function by using the keyword await and passing in the test data set.
#tensorflow #machine-learning