So I See That You’re Not Wearing a Mask…

Image classification is a field within machine learning in which neural networks are used to analyze image data. The neural network is given an image as input and produces a classification of that image as output. This classification identifies what the image shows. Image classification is used in a wide variety of fields, including facial recognition and medical imaging.

In these trying times that we’re currently going through, we have to do everything we can to fight the Covid-19 crisis. It is with this motivation, and my desire to learn more about data science, that I decided to do what I could to help, no matter how small. So, I decided to create an image classification model that can distinguish between a person wearing a mask and a person not wearing one, in the hope that it can be used to see how well people in a population are staying safe. The full code for this project can be found on my GitHub.

Data Collection and Processing

The data consists of 4,962 total images of people either wearing or not wearing a mask. These images were divided into train and test directories, with each directory having a “with mask” and a “without mask” subdirectory. The train subdirectories each contain 1,735 images, for a total of 3,470 in the train folder. The test folder has 738 images in the “with mask” subdirectory and 754 images in the “without mask” subdirectory, for a total of 1,492 images.
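As a quick sanity check on a layout like this, the images in each subdirectory can be counted and compared against the figures above. This is a minimal sketch, not the project's actual code: the folder names `train`, `test`, `with_mask`, and `without_mask` and the helper `count_images` are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical folder names; the article only describes "with mask" and
# "without mask" subdirectories inside train and test directories.
SPLITS = ("train", "test")
CLASSES = ("with_mask", "without_mask")


def count_images(root):
    """Return {(split, class_name): number of files} for the layout above."""
    root = Path(root)
    counts = {}
    for split in SPLITS:
        for class_name in CLASSES:
            folder = root / split / class_name
            # Count regular files only; a missing folder counts as zero.
            if folder.is_dir():
                counts[(split, class_name)] = sum(
                    1 for p in folder.iterdir() if p.is_file()
                )
            else:
                counts[(split, class_name)] = 0
    return counts


# The counts reported in the article, used here to check the totals.
reported = {
    ("train", "with_mask"): 1735,
    ("train", "without_mask"): 1735,
    ("test", "with_mask"): 738,
    ("test", "without_mask"): 754,
}
train_total = sum(n for (split, _), n in reported.items() if split == "train")
test_total = sum(n for (split, _), n in reported.items() if split == "test")
print(train_total, test_total, train_total + test_total)  # 3470 1492 4962
```

Calling `count_images` on the dataset root should reproduce the `reported` dictionary if the folders match the description.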

Sample training image of a person wearing a mask

The images were converted to grayscale and resized so that every image was the same size before being passed to the model. In this case, the images were resized to 50 by 50 pixels. Each image was given a label based on the directory it came from.
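The preprocessing steps can be sketched without any imaging library, which makes the logic easy to see. This is an illustration under assumptions, not the code used in the project: images are represented as plain nested lists, the luma weights and nearest-neighbour sampling are common choices rather than confirmed ones, and the label encoding (1 for mask, 0 for no mask) and folder names are hypothetical. In practice a library such as OpenCV or Pillow would do the same work far faster.

```python
# Hypothetical label encoding keyed by (assumed) subdirectory name.
LABELS = {"with_mask": 1, "without_mask": 0}


def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of luminance values (0-255)."""
    # ITU-R BT.601 luma weights, a common choice for grayscale conversion.
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def resize_nearest(image, size=50):
    """Resize a 2D image to size x size using nearest-neighbour sampling."""
    height, width = len(image), len(image[0])
    return [
        [image[y * height // size][x * width // size] for x in range(size)]
        for y in range(size)
    ]


def preprocess(rgb_image, directory_name, size=50):
    """Return (grayscale size x size image, label) for one source image."""
    gray = to_grayscale(rgb_image)
    return resize_nearest(gray, size), LABELS[directory_name]


# Usage: a synthetic 100 x 200 all-white image from the "with_mask" folder.
white = [[(255, 255, 255)] * 200 for _ in range(100)]
pixels, label = preprocess(white, "with_mask")
print(len(pixels), len(pixels[0]), label)  # 50 50 1
```

Fixing every image to the same 50 by 50 shape matters because the network's input layer expects a fixed number of pixels.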

From here, the training data was augmented in order to create more data to train the model on. Augmentation is important in image classification because it creates new images from the original data with slight modifications. This allows us to multiply the size of the training data without having to store a new image file for each generated image. For this project, the augmentation consisted of flipping the image horizontally and shifting it up, down, to the left, and to the right. By also applying these augmentations to already augmented images, the number of training images was increased by a factor of 32, giving us a total of 3,470 × 32 = 111,040 training images.
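One way a factor of 32 can arise is by treating each of the five augmentations (flip, up, down, left, right) as either applied or skipped, which gives 2^5 = 32 combinations per image. The sketch below illustrates that reading; the article does not spell out the exact combination scheme, so the enumeration, the one-pixel shift size `STEP`, the zero fill value, and the helper names are all assumptions.

```python
from itertools import product


def flip_horizontal(image):
    """Mirror a 2D image left to right."""
    return [row[::-1] for row in image]


def shift(image, dy, dx, fill=0):
    """Shift a 2D image by dy rows and dx columns, filling vacated pixels."""
    height, width = len(image), len(image[0])
    return [
        [
            image[y - dy][x - dx]
            if 0 <= y - dy < height and 0 <= x - dx < width
            else fill
            for x in range(width)
        ]
        for y in range(height)
    ]


STEP = 1  # assumed shift size in pixels; the article does not state it

TRANSFORMS = [
    flip_horizontal,
    lambda img: shift(img, -STEP, 0),  # up
    lambda img: shift(img, STEP, 0),   # down
    lambda img: shift(img, 0, -STEP),  # left
    lambda img: shift(img, 0, STEP),   # right
]


def augment(image):
    """Return 2**5 = 32 variants: each transform is either applied or skipped."""
    variants = []
    for mask in product([False, True], repeat=len(TRANSFORMS)):
        out = image
        for apply_it, transform in zip(mask, TRANSFORMS):
            if apply_it:
                out = transform(out)
        variants.append(out)
    return variants


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
variants = augment(image)
print(len(variants))         # 32
print(len(variants) * 3470)  # 111040
```

The first variant is the untouched original, so the 32 images per source include the source itself. Because the variants are computed in memory, none of them needs to be written to disk as a separate file.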

Modeling

towardsdatascience.com
