MNIST dataset has is known as “Hello world” of Image classification. Every Machine Learning Engineer tackles this dataset sooner or later.

Dataset

MNIST is a set of small images of handwritten digits. Look at the below image which has a few examples instances.

Image for post

MNIST data set

There are 70,000 images and each image has 784 features. This is because each image is 28 x 28 pixels, and each feature represents a pixel’s intensity, from 0 to 255.

There are many classification algorithms( SGD, SVM, RandomForest, etc) which can be trained on this dataset including deep learning algorithms (CNN).

Training and Evaluating

Let’s take an example of RandomForest Classifier and train it on the above dataset and evaluate it.

#scikit-learn #image-classification #classification #mnist #data analysis

Improving accuracy on MNIST using Data Augmentation
26.95 GEEK